Description

This invention relates to a method for distinguishing a first
chemical compound from a second chemical compound on the basis
of chromatographic data, wherein said chemical compounds absorb
ultraviolet radiation, according to the prior art portion of
claim 1 (see Paper No. 892, presented by H.-J.P. Sievert at the
39th Pittsburgh Conference and Exhibition, 1988).

There can be little doubt that mixtures of chemical compounds
have achieved great importance in modern society. The nature
and operation of such mixtures are of frequent concern in fields
such as agriculture, manufacturing, scientific research, and
medicine. Indeed, the human body could scarcely function in the
absence of chemical mixtures. Accordingly, it is frequently an
object of medicine and other arts to determine the identity and
concentration of the components in chemical mixtures found, for
example, in the human body or other chemical reaction systems.
Analysis of this sort finds numerous applications and provides
the primary basis for a wide variety of product quality control
programs and medical diagnostic techniques.

Probably the most common method for analyzing a mixture of one
or more chemical compounds entails isolating and then
characterizing each compound. Chromatography provides one means
for effecting such isolation. In virtually all chromatographic
separations, a mobile phase comprising a mixture of chemical
compounds passes through a stationary bulk phase. Gas and liquid
chromatography provide examples of techniques in which gases and
liquids, respectively, are employed as the mobile phase. A
number of variations on both gas and liquid chromatography are
known in the art. The choice of a given variation depends
intimately upon the particular separation to be performed. For
example, high-performance liquid chromatography (HPLC), a
technique in which a liquid mobile phase is passed through the
stationary phase under the influence of high pressure, finds
particular use in the separation and analysis of difficultly
separated compounds having relatively high molecular weight.

Compounds separated by HPLC or other types of chromatography are
generally then passed through a detector responsive to one or
more of the compounds. Flame ionization, thermal conductivity,
and ultraviolet (UV)/visible devices provide examples of
commonly-employed detectors. As will be appreciated by those
skilled in the art, ultraviolet detectors measure the degree to
which a given chemical species absorbs electromagnetic radiation
having wavelength between about 200 and about 400 nanometers
(nm). Those of skill in the art will also recognize that a
detector's positive response to a chemical compound is commonly
referred to as a peak. A detector's response to each isolated
component of a chemical mixture is often recorded, such as on
paper or magnetic media. A recorded sequential assemblage of
peaks is known in the art as a chromatogram.

A mixture of chemical compounds will commonly produce a
chromatogram somewhat characteristic of that mixture. However,
the particular chromatogram produced by a given chemical mixture
will be greatly dependent upon the conditions under which said
chromatogram is generated. As will be appreciated by those
skilled in the art, factors which may influence a chromatogram
include the solvent employed as an eluent, the pressure employed
in the chromatographic system, the type of stationary phase
used, and the nature of the chromatographic apparatus itself.

Because a chromatogram is to a certain degree characteristic of
a mixture of chemical compounds, chromatograms are often
compared in order to distinguish one such mixture from another.
For example, retention times derived from a chromatogram provide
one basis for such distinction. Retention times represent the
time intervals required for the isolation and detection of the
individual chemical components of a mixture subjected to
chromatographic analysis and are measured from the start of the
analysis. The height and area of individual peaks provide
additional bases for comparing two chemical mixtures.
Comparative analysis on the basis of such data will
understandably be complex where analyzed mixtures comprise many
individual compounds and will be further complicated by
variations in the conditions under which subject chromatograms
are generated. Thus, the results of such analyses often can only
be considered unambiguous when combined with other independent
analytical methods.

Accordingly, the analysis of chromatographic data is frequently
combined with or supplanted by other techniques. One such
technique involves measuring the response of isolated chemical
compounds upon exposure to one or more frequencies of infrared,
UV, visible, or other forms of electromagnetic radiation. It is
known, for example, that ultraviolet spectral data can provide
structural information regarding compounds that have been
separated on an HPLC system. Unfortunately, however, the
interpretation of UV spectral data is often more difficult than
interpretation of, for example, infrared spectral data. This
difficulty can be compounded by the fact that the analysis of
spectral data had traditionally been based on visual evaluation
and comparison of spectra selected during elution of a mixture.
These comparison techniques for UV spectra traditionally
utilized only a few points in the spectral profile to validate
identification. However, the fairly recent introduction of
full-spectrum photodiode-array ultraviolet detectors has
significantly altered traditional UV spectral analysis.
Diode-array spectrophotometers yield on-line spectra and allow
rapid collection of spectra over the ultraviolet and/or visible
range in digital form. These instruments, when interfaced with
HPLC systems, provide a powerful tool for the analysis of
complex mixtures that are not amenable to gas chromatography or
other types of separations. For example, those skilled in the
art will appreciate that when the composition of a liquid
chromatography mobile phase is varied for the same chemical
mixture, the order in which its constituent compounds elute from
a chromatographic apparatus can and often does change. The order
in which peaks associated with these compounds are recorded
will, in turn, correspondingly vary. In order to identify peaks
of interest it is vital that the peaks be tracked as their
elution is varied by the solvent. In principle, the use of a
diode-array detector can provide this facility.

EP 0 437 829 B1

Diode-array ultraviolet detection, however, is not without its
limitations. For example, peaks can and often do overlap and
respective UV spectra are sometimes insufficiently different to
provide unique identification. In addition, because diode-array
detectors commonly generate large amounts of information from a
single chromatographic analysis, manual and interactive data
reduction methods can prove time consuming and are often
incomplete and imprecise. Consequently, the development of
diode-array devices has hastened the development of mathematical
techniques for analyzing UV spectral data. Such mathematical
methods can be used to extend the use of diode-array data by the
deconvolution of peaks and by using pattern recognition
techniques.

Thus, a great deal of attention in the art has been directed to
the implementation of diode-array UV detectors in the analysis
of chemical compounds and mixtures of chemical compounds. The
goal of nearly all such techniques has been to determine the
identity of an unknown compound by comparing its spectral data
against vast libraries of similar data for known compounds.
Identification techniques following this format are known as
forward searches.

It would be of great utility, however, to also perform reverse
searches of spectral data to identify a predetermined number of
known components that are expected to be present in an unknown
sample or to distinguish dissimilar compounds or mixtures.
Reverse search spectral analysis could be employed in areas such
as the quality control of manufactured chemicals where it is
required that certain components be present in a given sample
and the presence of additional components is undesirable, even
critical.

SUMMARY OF THE INVENTION: 

It is an object of this invention to provide a method 
and apparatus for distinguishing two mixtures of 
chemical compounds. 

Another object of this invention is to provide a method and
apparatus for distinguishing two mixtures of chemical compounds
on the basis of chromatographic data.

Yet another object of this invention is to provide a method and
apparatus for distinguishing two mixtures of chemical compounds
on the basis of spectral data.

Still another object of this invention is to provide a method
and apparatus for distinguishing two such mixtures by isolating
and comparing their respective constituent chemical compounds.

It is a further object of this invention to provide a method and
apparatus for distinguishing two chemical compounds on the basis
of chromatographic and UV spectral data.

Accordingly, this invention provides a method and apparatus for
distinguishing a first mixture of chemical compounds from a
second mixture of chemical compounds by analyzing
chromatographic and spectrophotometric data associated with
chemical compounds isolated from the mixtures. The method and
apparatus provide spectral match factors and peak scores which
correlate the chemical compounds. These match factors and peak
scores are then employed in calculating sample scores indicative
of the similarities between the mixtures.

In a preferred embodiment, the method comprises the steps of
isolating the chemical compounds of the first and second
mixtures using chromatography; exposing each isolated chemical
compound one or more times to one or more selected wavelengths
of ultraviolet radiation; and recording the respective
absorbances of the isolated chemical compounds upon each
exposure to the ultraviolet radiation. The respective
absorbances of the isolated chemical compounds are then provided
to processing means as a first data set. Further steps performed
by the processing means include providing at least one general
match factor by applying a general matching function to the
first data set; providing respective average absorbances for the
isolated chemical compounds at each selected wavelength by
applying an averaging function to the first data set; providing
automatch factors by applying an automatching function to the
first data set and to the average absorbances; providing
crossmatch factors by applying a crossmatching function to the
first data set and to the average absorbances; and providing
match discriminators by applying a match discrimination function
to the general match factors. A second data set is then provided
to the processing means, said second data set comprising the
respective retention times, peak areas, and peak heights for the
isolated chemical compounds. Subsequent steps performed by the
processing means include providing retention deviations by
applying a retention deviation function to the second data set;
providing peak area deviations by applying a peak area deviation
function to the second data set; providing peak height
deviations by applying a peak height deviation function to the
second data set; providing area and height deviations by
applying an area and height deviation function to the peak area
deviations and the peak height deviations; assigning peaks by
applying a hierarchical assignment procedure; providing at least
one peak score for the isolated chemical compounds by applying a
peak scoring function to the match discriminators, retention
deviations, and area and height deviations; providing at least
one sample score by applying, via the processing means, a sample
scoring function to the peak scores; and distinguishing the
first mixture of chemical compounds from the second mixture of
chemical compounds on the basis of at least one sample score.

BRIEF DESCRIPTION OF THE DRAWINGS: 

The numerous objects and advantages of the present invention may
be better understood by those skilled in the art by reference to
the accompanying figures of which:

Figures 1a-c provide two wavelength-shifted absorbance plots for
the same chemical compound and a plot of general match factor
versus wavelength. The figures illustrate wavelength shift and
its correction by analysis of general match factors.

Figure 2 is an HPLC chromatogram of r-hGH 
separated with gradient 1. 

Figures 3a-c illustrate moderate spectral match between two
tryptic peptides. Figure 3a shows the UV spectra for the two
peptides, Figure 3b shows the distribution arising from plotting
pairwise absorbance values for both peptides at identical
wavelengths, and Figure 3c shows a comparison of the match
factors for all spectra for the two peptides.

Figures 4a-c illustrate strong spectral match between two
tryptic peptides. Figure 4a shows the UV spectra for the two
peptides, Figure 4b shows the distribution arising from plotting
pairwise absorbance values for both peptides at identical
wavelengths, and Figure 4c shows a comparison of the match
factors for all of the spectra for the two peptides.

Figures 5a and 5b illustrate background correction for peak
spectra for a tryptic peptide. Figure 5a shows the comparison of
uncorrected upslope, downslope and apex spectra for the peptide
peak with a standard spectrum. Figure 5b presents the same
spectra after background correction had been applied.

Figure 6 illustrates reproducibility of the tryptic map analyzed
with gradient II. The figure shows the superimposition of four
replicate elution profiles.

Figure 7 provides a table of standard deviations for retention
time, peak area, peak height, and match factor for tryptic
digests from r-hGH.

Figure 8 provides tables illustrating the similarity between
replicate samples of (a) tryptic digests from r-hGH analyzed
with gradient I and of (b) native and oxidized tryptic digests
from r-hGH analyzed with gradient II.

Figure 9 is an HPLC chromatogram of the tryptic map for oxidized
r-hGH analyzed with gradient II. The elution position for the
unoxidized peptides is indicated by arrows.

Figure 10 is a flowchart illustrating the Make-Library
subprogram.

Figure 11 is a flowchart illustrating the Compare-Libs
subprogram.

Figure 12 is a flowchart illustrating the Make-Std-Library
subprogram.

Figure 13 is a flowchart illustrating the Get-Sample-Score
program.

DESCRIPTION OF THE PREFERRED 
EMBODIMENTS: 

The principles and methods of the present invention are
applicable to a number of situations relating to the comparison
of individual chemical compounds and mixtures from which they
may be derived. Thus, it will be appreciated that the present
invention may be practiced in situations where the identities of
both of the compared species are unknown or, preferably, in
situations where the identity of one species is well-known and
that of the other is unknown. It is particularly preferred that
a library of calibration data for one species be available. It
is also preferred that chromatographic data relating to both
compared species be available. Chromatographic data includes
retention times, peak areas, and peak heights.

In accordance with the present invention, mixtures of chemical
compounds are first isolated into their respective components. A
preferred means of effecting such isolation is through the
employment of chromatography. Any form of chromatography might
conceivably be employed in the practice of this invention,
although liquid chromatography is preferred. It is particularly
preferred that high-performance liquid chromatography (HPLC) be
employed in isolating the chemical compounds of mixtures to be
analyzed in accordance with the present invention.

Once isolated, chemical compounds are exposed one or more times
to one or more selected wavelengths of ultraviolet radiation and
the respective absorbances of the isolated chemical compounds
upon each exposure are recorded. Those skilled in the art will
appreciate that the reliability of data derived from such
exposure will increase with the number of times such exposure is
effected and with the number of wavelengths employed.

Once recorded, such data is provided to processing means.
Processing means amenable to the practice of this invention
consist of a computing device such as the HP9000 Series 300
Pascal Workstation or any equivalent computing device capable of
compiling and executing instructions. These instructions should
be provided in a programming language such as Pascal or any
equivalent thereof capable of implementing the algorithms of
this invention. Processing means further include an input device
such as a keyboard and an output device such as a video display
or printer. Preferred processing means further include one or
more devices for the storage of data, such as magnetic disks or
tape. Processing means should also comprise an operating system
or programming environment for the generation of source code in
the appropriate programming language, along with a compiler or
other means of converting such source code into executable
programs.

Data may be provided to the processing means precisely as
recorded or may be prepared or pretreated by various means well
known to those skilled in the art. Examples of such preparation
or pretreatment are wavelength calibration, smoothing, and
transformation of the data, such as fast Fourier transform.

In accordance with the present invention, algorithms implemented
by the processing means are provided. In certain preferred
embodiments, these algorithms concern the problems encountered
in identifying components in an HPLC separation based on the
spectral and chromatographic data available from well
characterized calibration standards. One such algorithm concerns
the determination of spectral match factors. Thus, the spectral
matching function may be defined as:

$$ MF_s = 1000\,(1 - r^2) \tag{1} $$

where $MF_s$ stands for spectral match factor and r is a
correlation coefficient according to:

$$ r = \frac{\Sigma xy - (\Sigma x)(\Sigma y)/n_f}{\left[\{\Sigma x^2 - (\Sigma x)^2/n_f\}\,\{\Sigma y^2 - (\Sigma y)^2/n_f\}\right]^{1/2}} \tag{2} $$

where x and y, respectively, are absorbances taken from the
compared spectra at the same wavelength, $\Sigma$ is the
summation function, and $n_f$ is the number of selected
wavelengths. It will be understood by those skilled in the art
that other spectral matching functions, such as:

$$ MF_s = 1000 \cdot r^2 \tag{3} $$

can be employed in the practice of this invention.

Spectral match factors can range from zero for a perfect match
to 1000 for total absence of correlation. General match factors,
automatch factors, and crossmatch factors provide examples of
spectral match factors. For example, in determining general
match factors ($MF_g$), r is the correlation coefficient
obtained from the correlation between absorbances of individual
spectra for a first and a second chemical compound.
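
The match factor of equations (1) and (2) can be illustrated
with a short sketch. The following Python is only an
illustration, not the Pascal implementation referred to in this
description; the function names `correlation` and `match_factor`
and the sample absorbance values are assumptions introduced
here.

```python
import math

def correlation(x, y):
    """Correlation coefficient r of two absorbance lists, per equation (2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("spectra need the same length and at least 2 points")
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    num = sxy - sx * sy / n
    den = math.sqrt((sxx - sx * sx / n) * (syy - sy * sy / n))
    if den == 0:
        raise ValueError("a flat spectrum has no defined correlation")
    return num / den

def match_factor(x, y):
    """Spectral match factor per equation (1): 0 is perfect, 1000 is no correlation."""
    r = correlation(x, y)
    return 1000.0 * (1.0 - r * r)

# Hypothetical absorbances at the same four wavelengths.
spectrum = [0.10, 0.45, 0.80, 0.30]
scaled = [2 * a + 0.05 for a in spectrum]   # same shape, different concentration
unrelated = [1.0, -1.0, -1.0, 1.0]          # chosen so that r is zero against [1, 2, 3, 4]

print(match_factor(spectrum, scaled))            # close to 0
print(match_factor([1, 2, 3, 4], unrelated))     # 1000.0
```

Because r is unchanged by scaling and offset, a spectrum and a
more concentrated copy of itself score as a near-perfect match,
which is the behavior one would want when comparing peaks of
different size.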

As will be appreciated by those skilled in the art, one problem
with the general match factor thus described is the lack of a
meaningful limiting value for the differentiation between a
positive and a negative match. Accordingly, one embodiment of
the present invention provides such a limit.



After multiple copies of spectra are obtained for a first and
second chemical compound, the spectral match factors for certain
selected matches are compared. For example, the match factors
for all matches of individual spectra for the first compound are
compared with the average spectrum for that compound. In
addition, the match factors for matches of all individual
spectra for the second compound are compared against the
corresponding average spectrum for that compound. These
comparisons of individual versus average spectra are known in
accordance with this invention as automatching functions and the
match factors so obtained are known as automatch factors
($MF_a$).

In accordance with one embodiment of the present invention,
crossmatch factors are next obtained by matching: 1) all
individual spectra for the first compound against the average
spectrum for the second compound; and 2) all individual spectra
for the second compound against the average spectrum for the
first compound. The match factors obtained by comparing the
individual spectra of one compound with the average spectrum for
the other compound are known as crossmatch factors ($MF_c$).

The well-known Student's t-test is employed in analyzing the
results from automatching and crossmatching. Application of the
t-test in this invention yields a difference (D) between the
mean values for the automatch factors and the crossmatch
factors. The t-test also provides a probability that this
D-value is significant, i.e. that the two means are different.
Where these means are different, the first and second compounds
can reliably be said to represent different species.
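
The automatching and crossmatching steps can be sketched as
follows. This is a hedged illustration only: the function names,
the pooled-variance form of the t statistic, and the replicate
spectra are assumptions made for the example, and the sketch
stops at the t statistic rather than converting it to a
probability.

```python
import math
from statistics import mean, variance

def match_factor(x, y):
    """1000 * (1 - r^2) with r the correlation of the two absorbance lists."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    num = sum(a * b for a, b in zip(x, y)) - sx * sy / n
    den = math.sqrt((sum(a * a for a in x) - sx * sx / n)
                    * (sum(b * b for b in y) - sy * sy / n))
    r = num / den
    return 1000.0 * (1.0 - r * r)

def average_spectrum(spectra):
    """Average absorbance at each wavelength over replicate spectra."""
    return [mean(column) for column in zip(*spectra)]

def auto_and_cross(first, second):
    """Automatch: each spectrum vs its own average. Crossmatch: vs the other average."""
    avg1, avg2 = average_spectrum(first), average_spectrum(second)
    auto = ([match_factor(s, avg1) for s in first]
            + [match_factor(s, avg2) for s in second])
    cross = ([match_factor(s, avg2) for s in first]
             + [match_factor(s, avg1) for s in second])
    return auto, cross

def mean_difference_and_t(auto, cross):
    """D = mean(cross) - mean(auto), plus a pooled-variance t statistic (assumed form)."""
    d = mean(cross) - mean(auto)
    n1, n2 = len(auto), len(cross)
    pooled = ((n1 - 1) * variance(auto) + (n2 - 1) * variance(cross)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    t = d / se if se > 0 else math.inf
    return d, t

# Hypothetical replicate spectra (five wavelengths each) for two compounds.
first = [[0.10, 0.50, 0.90, 0.40, 0.10],
         [0.11, 0.49, 0.91, 0.41, 0.09],
         [0.09, 0.51, 0.89, 0.39, 0.11]]
second = [[0.20, 0.20, 0.30, 0.80, 0.90],
          [0.21, 0.19, 0.31, 0.79, 0.91],
          [0.19, 0.21, 0.29, 0.81, 0.89]]

auto, cross = auto_and_cross(first, second)
d, t = mean_difference_and_t(auto, cross)
print(d, t)
```

With clearly different spectral shapes, the crossmatch factors
sit far above the automatch factors, so D is large and positive.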

In accordance with one embodiment, a match discrimination
function may also be defined as follows:

$$ MT_{dis} = D / T(DF, prob) \tag{4} $$

where $MT_{dis}$ is the match discriminator, D is the difference
between the mean match factors derived from the automatching and
crossmatching functions, DF is the degrees of freedom, which are
calculated from the number of individual spectra for the first
and second compounds, and T(DF, prob) is the t-value required
for a desired degree of probability (prob, in %) that two means
differing by that t-value are different given the degrees of
freedom applicable. It is preferred that the degree of
probability be 99%. As will be appreciated by those skilled in
the art, $MT_{dis}$ depends on a number of factors, such as the
number of spectral data points employed, the noise present in
the individual spectra, any pretreatment applied to the spectra
before matching and, of course, the degree of similarity between
the two compounds compared. Of course, where $MT_{dis}$ is equal
to one (1) the actual probability that the first and second
compounds are different will be equal to the desired
probability. Where $MT_{dis}$ is less than one (1), the actual
probability will be less than the desired probability; where
$MT_{dis}$ is greater than one (1), the actual probability will
be greater than the desired probability.
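
A minimal sketch of equation (4) follows. The function names are
hypothetical, the two-sample form DF = n1 + n2 - 2 is an
assumption (the text states only that DF is calculated from the
number of individual spectra), and the required t-value is
passed in by the caller rather than computed. The value 3.169 is
the standard two-tailed 99% table entry for 10 degrees of
freedom.

```python
def degrees_of_freedom(n_first, n_second):
    """Assumed two-sample form: n1 + n2 - 2."""
    if n_first < 2 or n_second < 2:
        raise ValueError("need at least two spectra per compound")
    return n_first + n_second - 2

def match_discriminator(mean_difference, t_required):
    """Equation (4): D divided by the t-value required at the desired probability."""
    if t_required <= 0:
        raise ValueError("required t-value must be positive")
    return mean_difference / t_required

def interpret(mt_dis):
    """Restates the reading of MT_dis relative to one (1)."""
    if mt_dis > 1.0:
        return "actual probability exceeds the desired probability"
    if mt_dis < 1.0:
        return "actual probability is below the desired probability"
    return "actual probability equals the desired probability"

df = degrees_of_freedom(6, 6)            # 10
mt = match_discriminator(6.338, 3.169)   # hypothetical D, table t-value for DF = 10
print(df, mt, interpret(mt))
```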

In accordance with this invention, it is further intended that a
fixed $MT_{dis}$ be derived for a given standard. Such
derivation will permit testing for the significance of an
individual match between a standard and an unknown spectrum
without the need for a complete statistical analysis.

Where a given spectral match factor is not equal to zero,
another value indicative of the quality of that match factor can
be obtained by analysis of the residuals resulting from the
correlation between two spectra. It will be appreciated by those
skilled in the art that if a "best-fit" regression line is
calculated for the correlation between any two spectra X and Y,
such that one attempts to predict absorbance values for spectrum
Y from the correlated value of spectrum X for each wavelength
recorded, then the residual at each wavelength is a positive or
negative difference between the actual absorbance of spectrum Y
and the value of spectrum Y as predicted from the correlation
with spectrum X. When studied as a function of increasing
wavelength, residuals tend to fluctuate above or below zero (0).

If the two spectra differ in a systematic fashion, the residuals
will tend to migrate across the regression line only slowly. If,
on the other hand, the residuals are distributed around the
regression line in a random fashion, that same match factor
might still indicate spectral match, obscured only by noise.
Thus, in accordance with one embodiment of this invention, a
crossover number (CN) is defined as follows:

$$ CN = C / (N - 1) \tag{5} $$

where C is the number of times the residuals change sign when
sorted by increasing wavelength and N is the number of spectral
data points used for the match. It will be understood that the
maximum value for CN is one (1) and that CN can never quite
reach zero (0). Higher values for CN will indicate a likelihood
that the deviation from zero (0) for a given spectral match
factor is due to random noise and not to systematic differences
in the spectra compared. It will also be appreciated by those
skilled in the art that the crossover numbers described can also
be derived if spectra X and Y are exchanged; in this manner, one
might obtain slightly different values which nonetheless exhibit
the same characteristics.
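
The crossover number of equation (5) might be computed as in the
following sketch. The function name and the test data are
assumptions for illustration, as is the choice to ignore
residuals that are exactly zero when counting sign changes. Both
lists are taken to be already ordered by increasing wavelength.

```python
def crossover_number(x, y):
    """CN = C / (N - 1): sign changes of regression residuals of y on x."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two spectra of equal length, at least 3 points")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("spectrum X is flat; no regression line can be fitted")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    # Assumption: residuals that are exactly zero carry no sign and are skipped.
    signs = [res > 0 for res in residuals if res != 0]
    changes = sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)
    return changes / (n - 1)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noisy = [1.1, 1.9, 3.1, 3.9, 5.1, 5.9]          # alternating small errors
curved = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0]      # systematic curvature

print(crossover_number(x, noisy))    # 1.0, residuals flip sign at every step
print(crossover_number(x, curved))   # 0.4, residuals cross the line only twice
```

The noisy pair reaches the maximum of one, while the
systematically different pair scores much lower, in line with
the interpretation given above.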

Since the correlation procedure employs absorbance values at
identical wavelengths, the comparison of spectra having an error
in wavelength can lead to erroneous match factors. Thus, it is
particularly preferred in determining spectral match factors
that the wavelength assignment for the two spectra compared be
accurate. One means for providing accurate wavelength
assignments is by acquiring spectra for the same standard under
conditions (such as mobile phase, column, hardware calibration,
and instrument) identical to those employed in obtaining the two
spectra in question. Such acquisition might be achieved by use
of an internal standard.

Standard spectra thus acquired can then be used to calibrate
other, related spectra. As will be appreciated by those skilled
in the art, the acquired standard spectra can be used to
experimentally determine the difference in wavelength assignment
by analyzing the spectral match factors for the two standard
spectra as a function of a fractional wavelength shift to the
left or right of one spectrum against the other. As shown in
Figure 1, the maximum match factor should be obtained at a
wavelength shift necessary to correct for any wavelength
inaccuracy between the two unknown spectra. While each UV
absorbance can be utilized at its nominal, absolute value,
correlation can optionally be performed in accordance with one
embodiment of this invention by inversely weighting each
absorbance value by the variance known to be associated with the
wavelength at which it was obtained. Such procedure could
improve the reproducibility of the matching process by weighting
less heavily those regions of the spectrum known to be
unreliable.
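
The shift scan described above can be sketched as follows. This
is an assumption-laden illustration: the spectra are taken to be
sampled on a uniform wavelength grid, fractional shifts are
applied by linear interpolation, and the score used is r squared
(the form of equation (3)), so that the best shift gives the
maximum score. All names and the Gaussian test band are
hypothetical.

```python
import math

def r_squared(x, y):
    """Squared correlation coefficient of two equal-length lists."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    num = sum(a * b for a, b in zip(x, y)) - sx * sy / n
    den = (sum(a * a for a in x) - sx * sx / n) * (sum(b * b for b in y) - sy * sy / n)
    return (num * num) / den if den > 0 else 0.0

def resample(spectrum, shift):
    """Values of spectrum at grid positions i + shift, by linear interpolation.

    Returns (i, value) pairs only for positions that fall inside the grid.
    """
    last = len(spectrum) - 1
    pairs = []
    for i in range(len(spectrum)):
        pos = i + shift
        if pos < 0 or pos > last:
            continue
        lo = min(int(math.floor(pos)), last - 1)
        frac = pos - lo
        pairs.append((i, spectrum[lo] * (1 - frac) + spectrum[lo + 1] * frac))
    return pairs

def best_shift(reference, other, candidates):
    """Candidate shift (in grid steps) of `other` that best matches `reference`."""
    best = None
    for shift in candidates:
        pairs = resample(other, shift)
        x = [reference[i] for i, _ in pairs]
        y = [value for _, value in pairs]
        score = r_squared(x, y)
        if best is None or score > best[1]:
            best = (shift, score)
    return best

# Hypothetical absorbance band; the second copy is displaced by half a grid step.
grid = range(41)
reference = [math.exp(-((i - 20.0) ** 2) / 50.0) for i in grid]
displaced = [math.exp(-((i - 20.5) ** 2) / 50.0) for i in grid]
candidates = [k * 0.25 for k in range(-8, 9)]   # -2.0 to +2.0 steps

shift, score = best_shift(reference, displaced, candidates)
print(shift, score)
```

The recovered shift would then be applied to the related spectra
before match factors are computed.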

It will be appreciated by those skilled in the art that chemical
compounds can be distinguished for certain purposes by employing
general match factors, automatch factors, and crossmatch factors
individually or in conjunction with one another. For example,
general match factor alone will sometimes be sufficiently
indicative of the degree of similarity between two chemical
compounds. In other cases, general match factor alone will be
inconclusive and it may prove necessary to consider either
automatch factor or crossmatch factor, or both, to effectively
distinguish chemical compounds.

In certain embodiments, the present invention also provides a
method for analyzing chromatographic data, along with UV
spectral data, to determine on a peak-by-peak basis the best
match for a given standard in an unknown sample. In this regard,
the parameters retention time deviation ($RT_{dev}$), peak area
deviation ($AR_{dev}$), peak height deviation ($HT_{dev}$), and
area and height deviation ($AH_{dev}$) are defined by the
following functions:

$$ RT_{dev} = |RT_1 - RT_2| / RT_{lim} \tag{6} $$

$$ AR_{dev} = |AR_1 - AR_2| / AR_{lim} \tag{7} $$

$$ HT_{dev} = |HT_1 - HT_2| / HT_{lim} \tag{8} $$

$$ AH_{dev} = (AR_{dev} + HT_{dev}) / 2 \tag{9} $$

where the subscripts 1 and 2, respectively, denote expected and
actual data or data corresponding to any two chemical compounds,
and lim indicates an experimentally or otherwise defined limit
of variability for the indicated quantities.
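
Equations (6) to (9) translate directly into code. The sketch
below is illustrative only; the dictionary keys, the function
names, and the numeric limits are assumptions chosen for the
example.

```python
def deviation(expected, actual, limit):
    """|expected - actual| / limit, the common form of equations (6) to (8)."""
    if limit <= 0:
        raise ValueError("limit of variability must be positive")
    return abs(expected - actual) / limit

def peak_deviations(standard, unknown, limits):
    """Retention, area, height, and combined area-and-height deviations."""
    rt_dev = deviation(standard["rt"], unknown["rt"], limits["rt"])
    ar_dev = deviation(standard["area"], unknown["area"], limits["area"])
    ht_dev = deviation(standard["height"], unknown["height"], limits["height"])
    ah_dev = (ar_dev + ht_dev) / 2          # equation (9)
    return {"rt_dev": rt_dev, "ar_dev": ar_dev, "ht_dev": ht_dev, "ah_dev": ah_dev}

# Hypothetical standard and unknown peaks with hypothetical limits.
standard = {"rt": 10.0, "area": 100.0, "height": 50.0}
unknown = {"rt": 10.5, "area": 110.0, "height": 45.0}
limits = {"rt": 0.25, "area": 20.0, "height": 10.0}

devs = peak_deviations(standard, unknown, limits)
print(devs)   # rt_dev 2.0, ar_dev 0.5, ht_dev 0.5, ah_dev 0.5
```

A deviation of one (1) means the difference is exactly at the
chosen limit of variability, which is what makes the fixed
threshold of one usable in the assignment procedure.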

Thus, the provided peak assignment algorithm uses a hierarchical
procedure which employs the various parameters to select peaks
corresponding to chemical compounds which are to be paired and
further analyzed. In accordance with certain embodiments, all
unknown candidate peaks for each standard inside an optional
retention time window are ranked by increasing match
discriminator. If the candidate peak with the lowest match
discriminator and the one with the next highest match
discriminator differ by more than one (1), the one with the
lowest match discriminator is considered a positive
identification. If the difference is less than one (1),
retention time deviation is considered next such that the peak
with the lowest retention time deviation is considered a
positive match if the next highest retention time deviation
differs by more than one (1). If analysis of retention time
deviation does not provide a statistically significant result,
the area and height deviation is analyzed in a similar fashion.
If, at this point, a positive identification has not been
reached, CN is considered such that the candidate with the
highest CN is selected as a match.
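
The hierarchical rule for a single standard can be sketched as
below. The function name, dictionary keys, and candidate values
are hypothetical, and the sketch assumes that every candidate
stays in consideration at each stage, which the text does not
state explicitly. Resolving conflicts between standards is not
covered here.

```python
def pick_candidate(candidates, gap=1.0):
    """Select the unknown peak for one standard using the hierarchical rule.

    Each candidate is a dict with keys mt_dis, rt_dev, ah_dev, and cn.
    """
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    # Stages in order: match discriminator, retention deviation, area/height deviation.
    for key in ("mt_dis", "rt_dev", "ah_dev"):
        ranked = sorted(candidates, key=lambda c: c[key])
        if ranked[1][key] - ranked[0][key] > gap:
            return ranked[0]          # clear winner at this stage
    # Last resort: the highest crossover number wins.
    return max(candidates, key=lambda c: c["cn"])

clear = [
    {"name": "a", "mt_dis": 0.2, "rt_dev": 0.1, "ah_dev": 0.1, "cn": 0.5},
    {"name": "b", "mt_dis": 1.5, "rt_dev": 0.1, "ah_dev": 0.1, "cn": 0.5},
]
by_retention = [
    {"name": "a", "mt_dis": 0.2, "rt_dev": 3.0, "ah_dev": 0.1, "cn": 0.5},
    {"name": "b", "mt_dis": 0.6, "rt_dev": 0.5, "ah_dev": 0.1, "cn": 0.5},
]
by_crossover = [
    {"name": "a", "mt_dis": 0.2, "rt_dev": 0.3, "ah_dev": 0.4, "cn": 0.3},
    {"name": "b", "mt_dis": 0.6, "rt_dev": 0.5, "ah_dev": 0.2, "cn": 0.8},
]

print(pick_candidate(clear)["name"])          # a
print(pick_candidate(by_retention)["name"])   # b
print(pick_candidate(by_crossover)["name"])   # b
```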

It will be understood that peak assignment between standards and
unknowns has to be bidirectionally unambiguous; that is, each
standard can only be matched by one unknown and vice versa.
Thus, in cases where two different standards are matched by the
same unknown peak, the priority of standards is established in
accordance with this invention on the basis of the same rules
used to determine the best unknown matched candidate.

After successful peak assignment, there will be a defined,
unambiguous relationship between the peaks in the standard and
the unknown such that at most one and possibly no peak is
assigned for the unknown to each peak from the standard.
Consequently, two possibilities exist for peak assignment: (1)
all peaks in a standard have one peak for the unknown assigned
to them and the unknown contains zero or more extra peaks that
do not correspond to any standard; or (2) not all peaks in a
standard have been assigned unknown peaks and the unknown
contains zero or more extra peaks that do not correspond to any
standard.

In accordance with certain embodiments, a peak score (PS) is
next calculated for all pairs of successfully assigned peaks as
follows:

$$ PS = [(f_m \cdot MT_{dis}) + (f_r \cdot RT_{dev}) + (f_a \cdot AH_{dev})] / NF \tag{10} $$

where $f_m$, $f_r$, and $f_a$ are variable weighting factors for
match discriminator, retention time deviation, and area and
height deviation, and NF is an empirically derived normalization
factor, typically three (3), equal to the number of parameters
employed.
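
As a worked instance of equation (10), the sketch below uses
unit weighting factors and NF = 3. The function name and the
input values are assumptions for illustration.

```python
def peak_score(mt_dis, rt_dev, ah_dev, f_m=1.0, f_r=1.0, f_a=1.0, nf=3.0):
    """Equation (10): weighted sum of the three parameters divided by NF."""
    if nf <= 0:
        raise ValueError("normalization factor must be positive")
    return (f_m * mt_dis + f_r * rt_dev + f_a * ah_dev) / nf

# Hypothetical values for one assigned pair of peaks.
ps_equal = peak_score(1.5, 0.5, 1.0)                   # (1.5 + 0.5 + 1.0) / 3
ps_no_retention = peak_score(1.5, 0.5, 1.0, f_r=0.0)   # retention ignored
print(ps_equal, ps_no_retention)
```

Setting a weighting factor to zero, as in the second call, is
one way to leave a parameter out of the score when it is known
to be unreliable for a given analysis.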

As a further indication of confidence in a given peak match, the
difference in peak score between the candidate peak and the next
best match can be used. It is also possible to reverse the order
in which retention time deviation, area and height deviation and
crossover number are used to resolve ambiguous matches, or to
not include any or all values in the comparison. For example, if
it is known that the response can vary from sample to sample, it
might make sense not to use response matching. If, on the other
hand, the same sample is analyzed using different
chromatographic conditions, retention time deviation might be
meaningless and area and height deviation could be used in peak
tracking. It will be appreciated that such considerations will
depend intimately upon each particular analysis and the facts
associated therewith.

In one embodiment of the present invention, a
modification of the algorithms accounts for the
possibility that a chromatographic peak in the unknown
might actually contain more than one component. In
such embodiment, each candidate peak is checked
for the presence of all the standards occurring in the
pre-selected retention time window using
multicomponent analysis. All but one of the standards are then
subtracted from the unknown spectrum at the
concentration determined and the resulting corrected
spectrum is matched against the remaining
standards as previously discussed.

Once peak score has been defined, a sample
score (SS) can be defined as follows:

$$SS = \frac{\sum PS + (p_1 \cdot EP) + (p_2 \cdot MP)}{N} \tag{11}$$

where the individual peak scores are summed over all
standard peaks successfully matched from the
unknown, EP are extra peaks not present in the
standard and are weighted by factor $p_1$, missing peaks
(MP) are weighted by a penalty score $p_2$, and N is the
total number of standard peaks expected. It will be
appreciated by those skilled in the art that sample
scores for well characterized reference materials can
be analyzed to arrive at reasonable confidence limits
for sample score. Scores for unknown samples can
then be compared and their similarity to the standard
can be indicated by the difference in sample scores.

While the principles of the present invention are
described as they apply to chromatograms produced
by HPLC, it is intended that the theories and methods
described herein are equally applicable to
chromatograms produced by other well known methods, such
as gas chromatography and liquid techniques other
than HPLC, such as capillary zone electrophoresis.

It is also intended that spectral data amenable to
the practice of this invention may be derived from
ultraviolet, visible, fluorescence, infrared, Raman,
atomic absorption, nuclear magnetic resonance, and
mass spectroscopic devices. It is preferred that any
such spectroscopic device provide electromagnetic
radiation having reproducible wavelength. It is
particularly preferred that UV instruments be employed,
due to both the generally high reproducibility of UV
radiation and the consistent manner in which
absorbance at one UV wavelength relates to absorbances
at neighboring wavelengths. This is to be contrasted
with discrete banded spectra encountered, for
example, in nuclear magnetic resonance spectroscopy.

Additional objects, advantages, and novel features
of this invention will become apparent to those
skilled in the art upon examination of the following
examples thereof concerning the identification of
peptide fragments from a tryptic digest of recombinant-
DNA-derived human growth hormone (r-hGH).

Preparation of Tryptic Digest of r-hGH 

Samples of r-hGH were oxidized by adding 50 µl
of chilled performic acid (nine parts 88% formic acid
and one part 30% hydrogen peroxide) to 1.0 mg
r-hGH and reacting the mixture for one hour at 0°C.

Samples were digested in a buffer solution
containing 100 mM sodium acetate, 10 mM Tris base and
1 mM calcium chloride at pH 8.3 at 37°C by addition
of 1:100 trypsin (trypsin:r-hGH, by weight) at times
zero and at two hours. Samples were acidified after
a total of four hours with 100 µl of phosphoric acid (pH
less than 3) per milliliter of sample and analyzed
directly or stored for up to three days at 2-8°C. The
digestion of r-hGH was complete after four hours.

Separation by HPLC 

HPLC separations were performed using a
Hewlett-Packard 1090M HPLC system equipped with a
DR5 ternary pumping system, an automated injection
and sampling system, a heated column compartment
and a diode-array detector, and controlled by an
HP 79994A ChemStation.

Two gradient systems were employed for the
separation of the tryptic fragments. System I used
trifluoroacetic acid (TFA) in water at 0.1% as solvent A,
with 0.8% TFA in acetonitrile as solvent B. The
gradient was linear from 0 to 60% B between 0 and 120
minutes at a flow-rate of 1 ml/min with the oven
temperature set at 40°C. System II utilized 50 mM sodium
phosphate in water, pH 2.85, as solvent A; solvent B
was acetonitrile. The gradient profile was linear from
0 to 40% B over 120 minutes at a flow-rate of 1 ml/min
with the oven temperature set to 40°C. For both
gradient systems a 15 cm x 0.46 cm Nucleosil C18
reversed phase column was used with particle size 5
µm, pore size 100 Å, packed by Alltech Associates.
Figure 2 shows a typical chromatogram of a mixture
of tryptic peptides derived from an r-hGH reference
standard analyzed with the TFA gradient system.

Data Processing 

For all analyses, spectra were acquired at one- 
second intervals over the range from 200 to 350 nm. 
In addition, chromatographic signals were recorded at 
220, 230, 254, 274, 280, and 292 nm with a reference 
wavelength of 350 nm in all cases. Raw data were
stored on magnetic media and were processed on the
ChemStation using the built-in spectral library func-
tions as well as additional evaluation software that 
was written for that purpose using a high-level com- 
mand language available on the ChemStation. 

Spectral Matching 

Numerical point-by-point comparison of two
UV spectra was implemented on the ChemStation with
the COMPARE command described in A. Drouen,
The Compare Command, Information Note,
Publication Number 12-5952-3725, Hewlett-Packard GmbH,
Waldbronn, FRG (1987). This comparison is
illustrated in Figure 4, where spectra for peptides T13 and T14
are compared. At each wavelength, absorbance
values for the two peptide spectra are plotted as
abscissa and ordinate, and a linear regression is applied to
the resulting scatter plot, as shown in Figure 4b. The
square of the correlation coefficient, multiplied by
1000, is defined as the match factor for the two
spectra. Those skilled in the art will appreciate that the two
peptides shown in Figure 4a differ in the nature of the
aromatic amino acid residue, which is phenylalanine
for T13 and tyrosine for T14. Their spectra are clearly
different even on visual comparison, and the match
factor accordingly has a low value of 919.

Figure 3 illustrates how the match factor is
affected when T13 is compared with T12, a peptide
fragment which does not contain any aromatic amino acid
at all. The corresponding spectra are very similar and
the match factor increases to 997 (Figure 3b),
approaching the value expected for identical spectra.
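
The match factor defined above (the square of the correlation coefficient of the two spectra, multiplied by 1000) can be sketched as follows. This is only an illustrative sketch and not the ChemStation COMPARE command itself; the function name `match_factor` and the representation of a spectrum as a plain list of absorbance values sampled at the same wavelengths are assumptions made for the example.

```python
def match_factor(spectrum_a, spectrum_b):
    """Return 1000 * r**2 for two absorbance lists sampled at the same wavelengths.

    Hypothetical helper: r is the Pearson correlation coefficient obtained
    from regressing one spectrum against the other.
    """
    if len(spectrum_a) != len(spectrum_b) or len(spectrum_a) < 2:
        raise ValueError("spectra must have the same length (at least 2 points)")
    n = len(spectrum_a)
    mean_a = sum(spectrum_a) / n
    mean_b = sum(spectrum_b) / n
    # Sums of squares and cross-products about the means.
    s_aa = sum((a - mean_a) ** 2 for a in spectrum_a)
    s_bb = sum((b - mean_b) ** 2 for b in spectrum_b)
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in zip(spectrum_a, spectrum_b))
    if s_aa == 0 or s_bb == 0:
        raise ValueError("a flat spectrum has no defined correlation")
    return 1000.0 * (s_ab * s_ab) / (s_aa * s_bb)
```

Because only the correlation is used, two spectra that differ by a constant scale factor (for example, the same compound at two concentrations) give a match factor of 1000 in this sketch.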

Compilation of Spectral Calibration Library

A library of standard spectra for the various
fragments in the tryptic map of r-hGH was next compiled.
For this purpose, a reference standard was injected
four times and analyzed with gradient systems I (TFA
based) and II (phosphate based). Each of the
resulting data files was then processed.

After integration of the signal at 220 nm, apex
spectra were identified for all integrated peaks. They
were corrected for solvent background by subtracting
a reference spectrum which was interpolated from
two baseline spectra at either side of the peak. The
resulting peak spectra were then stored into a library
file which was referred to as a sample library since it
contained all spectra characteristic of a given sample.

The two-point reference correction employed
was especially important in the case of gradient I
since TFA undergoes a significant change in spectral
properties as the acetonitrile concentration is
increased during the course of the gradient elution.
Figure 5 illustrates how the uncorrected upslope,
downslope, and apex spectra for fragment T9 differ
significantly from the standard T9 spectrum. After baseline
correction, all three spectra matched the standard
spectrum closely, as shown in Figure 5b.

Next, a retention time window of ±0.5 minutes
centered on the apex of each peak from the first
standard was employed to find the spectrum with the best
match from each of the other three standards. Those
spectra that were common to all four standards were
then averaged, normalized, smoothed, and transferred
into a new spectral library file which was named the
calibration library. For each peak in the tryptic map,
this library file contains the UV spectrum and values
for area, height, retention time, and scaling factor; all
values were based on averages from the four
standard runs.

As discussed in W.S. Hancock, et al., Cold Spring
Harbor Symposium, (1988) p. 95, the identity of the
tryptic fragments had been determined by amino acid
analysis and fast atom bombardment mass
spectrometry. Library entries for peaks eluting prior to the
first and after the last tryptic fragment, as well as
entries for peaks with area or height below 1% of total
area or height, were then removed. As had been
shown in the Hancock reference, most of the minor
peaks were not related to r-hGH but were nonspecific
background, presumably derived from trypsin or due
to other interferences, such as baseline noise or
solvent impurities.

The final calibration library for the TFA system
contained 40 entries, 19 of which represented tryptic
fragments of known identity. The phosphate library in
its final form consisted of 31 entries. These two
calibration libraries were used in all subsequent
experiments.

It should be noted that correlation of data from
different standard runs relies heavily on good
chromatographic reproducibility. In Figure 6,
chromatographic traces from four replicates analyzed with
gradient II are overlaid to demonstrate excellent
instrument performance even towards the end of the
gradient. Statistical analysis of retention time variations
showed the average standard deviation for all peaks
incorporated into the calibration library to be 0.027
min (1.6 s) and 0.021 min (1.3 s) for gradient systems
I and II, respectively.

Determination of Reproducibility and Selectivity of 
the Calibration Library 

Since two key properties of the match factor that 
determine the usefulness of the spectral data incor- 
porated into the calibration library are sensitivity and 
selectivity, it was decided to investigate these prop- 
erties in a systematic fashion in order to obtain some 
quantitative guidelines. Results were obtained using 
gradient I since TFA, when employed as modifier, 
presents a greater challenge for a liquid chromato- 
graph detector and pump than does phosphate. 

Reproducibility of the match factor determines
the absolute limit for the similarity between any two
spectra and thus defines the sensitivity of spectral
matching. Two spectra can be considered different
only when mean and standard deviation for the match
between the two differ significantly from those
obtained by repeatedly matching identical spectra. It is
not sufficient to use a match factor cutoff as the criterion
for a positive identification. Additional statistical
information is needed to determine the significance of a
given match factor.

Spectra for T13 and T14 derived from eleven
different injections were averaged to obtain a
representative spectrum for each peptide. All individual
spectra were then matched against their respective
average (automatching, as shown in Figure 4c) and the
resulting distribution of match factors was compared
with that obtained from matching individual T13
spectra against the average T14 spectrum and vice versa
(crossmatching, as shown in Figure 4c). It can be
seen that the means for automatch factor and
crossmatch factor are quite different; the average value for
the crossmatch factor of 918.6 is certainly a good
indication of dissimilarity. More importantly,
confidence intervals of three standard deviations above
and below each mean, as indicated in Figure 4c, do not
overlap, but show a significant gap. Thus, T13 can be
distinguished from T14 with a great degree of
confidence.

Figure 3c shows the corresponding plot of
automatch factors and crossmatch factors for T13 and
T12. These peptides are very similar in their spectral
characteristics, as can be seen by the mean
crossmatch factor score of 997.25. Nonetheless, there is
still a clear gap between the confidence intervals for
automatch factor and crossmatch factor, indicating
that it is possible to differentiate between compounds
of extreme similarity. In statistical terms, if Student's
t-test is applied to the data in Figure 3c, a t-value of
57 is obtained along with a probability of better than
99.99% that the mean values obtained for automatch
factor and crossmatch factor are indeed different.

The t-test for the comparison for T13 and T14
(Figure 4c) results in a t-value of 542 and a probability
of 100.00% that the spectra are different. t-Values
representing the similarity among the four aliphatic
peptides (T7, T8, T11, and T12) ranged from 13 to
133, which is sufficient for statistically valid
distinction. It will be appreciated by those skilled in the art
that for a population size of 11, a t-value of at least 6.2
is required to provide greater than 99.99% probability
that two means are different.
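
As a rough illustration of the comparison described above, a two-sample t-value for a set of automatch factors against a set of crossmatch factors might be computed as in the following sketch. The text does not state which variant of Student's t-test was applied; the pooled-variance form is assumed here, and the function name `pooled_t_value` is hypothetical.

```python
import math
import statistics


def pooled_t_value(sample_a, sample_b):
    """Absolute two-sample t-value using a pooled variance (assumed variant)."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two values")
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # Sample variances (n - 1 in the denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    if pooled == 0:
        raise ValueError("both samples have zero variance")
    return abs(mean_a - mean_b) / math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
```

The resulting t-value would then be compared against the critical value appropriate to the number of injections, such as the value of 6.2 quoted above for a population size of 11.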

When the reproducibility of match factors for the
four standard runs using gradient I was analyzed, it
was found that the match factor ranged from 998.76
to 1000.00, with standard deviations from less than
0.001 to 1.306. This indicated that very stringent
match criteria could be employed for spectral identity.
Since variability of the match factor increases as
peak concentrations decrease and since the relative
concentrations of the tryptic fragments from r-hGH
should be fairly constant, it was decided to define
individual match criteria for each entry in the calibration
library rather than use a fixed match threshold. To be
considered a positive match, an unknown spectrum
had to have a match score above a threshold of three
standard deviations below the mean match for a given
standard. This provided a 99.8% probability that only
correct matches were assigned.
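
The per-entry match criterion just described (a threshold three standard deviations below the mean match for a given standard) can be sketched as follows. The function names and the use of the sample standard deviation are assumptions made for illustration; the replicate match factors are taken to be those obtained for one library entry across the standard runs.

```python
import statistics


def match_threshold(replicate_match_factors, n_sigma=3.0):
    """Threshold for one library entry: mean minus n_sigma standard deviations."""
    mean = statistics.mean(replicate_match_factors)
    sd = statistics.stdev(replicate_match_factors)  # sample standard deviation (assumed)
    return mean - n_sigma * sd


def is_positive_match(match_factor, replicate_match_factors):
    """True when an unknown's match factor lies above the entry's threshold."""
    return match_factor > match_threshold(replicate_match_factors)
```

An entry with tightly reproducible spectra therefore receives a stringent threshold, while a low-concentration entry with more variable match factors receives a more tolerant one.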

To establish selectivity of the calibration library,
each standard in the calibration library was matched
against every entry from a typical sample library to
determine the number of potential mismatches. A
mismatch in this context was defined as a standard entry
for which more than one match candidate was found
with a match factor inside the confidence limits
previously established. According to certain
embodiments, selectivity can be greatly enhanced by
defining a retention time window around a given standard
to limit the number of search candidates. For
example, when a retention time window of ±1 min was employed,
incorrect matches were found for only three
standards. These mismatches were all minor peaks with
peak heights between 3 and 6 milliabsorbance units
(mAU) and did not correspond to any known tryptic
fragments of r-hGH. With a ±0.5 min window, no
mismatches were found. It was thus concluded that with
the selection of an appropriate retention time window,
the calibration library for r-hGH provides accurate
identification of all fragments.
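
The effect of the retention time window on selectivity can be illustrated with a short sketch. The data layout (each sample peak given as a pair of retention time and match factor against the standard in question) and the function names are assumptions made for the example, as are the numerical values in the usage note.

```python
def candidates_in_window(standard_rt, candidates, window, threshold):
    """Return sample peaks inside the window whose match factor meets the threshold.

    candidates: list of (retention_time, match_factor) pairs, where the match
    factor is that of the sample peak against the standard under test.
    """
    return [
        (rt, mf)
        for rt, mf in candidates
        if abs(rt - standard_rt) <= window and mf >= threshold
    ]


def is_mismatch(standard_rt, candidates, window, threshold):
    """A standard counts as mismatched when more than one candidate qualifies."""
    return len(candidates_in_window(standard_rt, candidates, window, threshold)) > 1
```

For instance, with hypothetical sample peaks at 10.2, 10.8, and 12.0 min that all exceed the threshold for a standard eluting at 10.0 min, a ±1 min window yields two candidates (a mismatch), whereas a ±0.5 min window leaves only one.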

Traditional calibration procedures for peak
identification, such as those implemented in the standard
ChemStation software and similar in nature to other
commercially available software for chromatographic
data handling, where peak recognition is based only
on retention times, resulted in mismatches for 5-8
standards inside a ±0.5 min retention time window.
When the window was increased to ±1 min, nearly all
standards exhibited mismatched peaks.

Definition and Application of the Peak Score

It will be appreciated by those skilled in the art
that since chromatographic conditions are not always
stable, resolution between adjacent peaks may
change or additional peaks may appear in a tryptic
map. Such instability will make positive identification
of an unknown peak difficult, even when spectral
matching is employed. However, in addition to peak
spectra, other quantitative information is available for
each peak and can be utilized in accordance with
certain embodiments of the present invention to develop
a procedure that will assign a numerical similarity
score to each match between a standard and an
unknown peak. Figure 7 shows the variability of the
different parameters available to construct this score.
Based on the relative standard deviations, it is
obvious that the greatest confidence can be placed in the
match factor. It can be seen that retention time
information and peak area and height exhibit deviations
larger than those for the match factor by one and two
orders of magnitude, respectively.

Based on the statistical information in Figure 7,
the peak score can be empirically derived as follows:

$$PS = \frac{10 \cdot MT_{dis} + RT_{dev} + 0.1 \cdot (AR_{dev} + HT_{dev})}{11.2} \tag{12}$$

where, to avoid unrealistically high delta values, the
following minimum values were established: 0.1 for
$MT_{dis}$, 0.05 min for $RT_{dev}$, and 1% for $AR_{dev}$ and $HT_{dev}$.
In this manner, equation (12) accounts for the fact
that the spectral match is the most significant
parameter for peak recognition and therefore is weighted
most heavily. Even if all other parameters indicate a
perfect match, a large deviation in the match factor
indicates that the peak in question has the wrong
identity. The scaling factor of 11.2 is the sum of all
weighting factors and normalizes the peak score to unit
weight.
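
A minimal sketch of the peak score calculation is given below. It assumes that equation (12) is read with weighting factors of 10 for the match discriminator, 1 for the retention time deviation, and 0.1 each for the area and height deviations, which sum to the stated scaling factor of 11.2. The four deviation terms are taken as already computed according to their definitions elsewhere in this description; the function name and argument names are hypothetical.

```python
def peak_score(mt_dis, rt_dev, ar_dev, ht_dev):
    """Weighted peak score; 0 is a perfect match, larger values are worse.

    Assumed weights: 10 (match discriminator), 1 (retention time deviation),
    0.1 each (area and height deviations); 11.2 is their sum.
    """
    weighted = 10.0 * mt_dis + 1.0 * rt_dev + 0.1 * (ar_dev + ht_dev)
    return weighted / 11.2
```

With these weights, a unit deviation in the spectral match alone contributes far more to the score than unit deviations in all three other parameters combined, which reflects the emphasis placed on the match factor.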

By definition, a perfect peak score would be zero;
a score of one will provide a 99.8% probability that
positive matches will not be missed, but usually
indicates rather marginal similarity between standard
and unknown. Peak scores for all entries in the four
sample libraries used to construct the calibration
library ranged from 0.002 to 0.465 with an average
score of 0.051. Because the score is open ended, it
was somewhat arbitrarily decided that a score of two
or larger indicated a totally mismatched peak. It will be
appreciated by those skilled in the art that the
probability that a positive match will result in a score of 2
is less than 0.000002%.

Automated Evaluation of Digests Using a Sample Score

Knowing how well a peak from a calibration
library is matched by any given peak in an unknown
sample, the next step is to develop a scoring procedure
which describes the overall similarity between all of
the peaks in the unknown and in a calibration sample.
The sample score as previously defined allows for the
accounting of missed calibration peaks as well as for
supernumerary peaks found in a sample.
Furthermore, the score is normalized so as to be independent
of the number of entries in the calibration library.
Normalization becomes a concern if the library is
modified. Since peak scores larger than 2 have been
defined as mismatches, all peak scores are truncated to
2 so that missed and mismatched peaks have the
same peak score. The penalty score of 1 for extra
peaks is strictly empirical at this point; another
possible approach would be to have the penalty reflect
the size of the extra peak.
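
The sample score with the conventions just described (peak scores truncated to 2, a penalty of 2 for each missing peak, and a penalty of 1 for each extra peak) can be sketched as follows. The function and parameter names are hypothetical, and the count of expected standard peaks is supplied by the caller rather than derived.

```python
def sample_score(peak_scores, n_extra, n_missing, n_expected,
                 extra_penalty=1.0, missing_penalty=2.0, cap=2.0):
    """Sample score: capped peak scores plus penalties, divided by expected peaks.

    peak_scores: scores of the standard peaks matched in the unknown.
    n_extra:     number of sample peaks with no counterpart in the standard.
    n_missing:   number of standard peaks not found in the sample.
    n_expected:  total number of standard peaks expected (N in equation (11)).
    """
    if n_expected <= 0:
        raise ValueError("n_expected must be positive")
    # Truncate each peak score so a mismatch counts the same as a missing peak.
    capped = sum(min(score, cap) for score in peak_scores)
    return (capped + extra_penalty * n_extra + missing_penalty * n_missing) / n_expected
```

A sample in which every standard peak is matched perfectly and no extra peaks occur scores exactly zero under this sketch.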

While a perfect sample score is easily defined as
being exactly zero, a determination must be made
concerning a criterion for what constitutes the limit
between a passing and a failing score. Meaningful
limits will have to be established through statistical
analysis of typical sample scores for reference
standards to account for variability due to different lots of
growth hormone and trypsin, as well as overall
chromatographic variability.

Figure 8a provides the sample scores for the four
sample libraries (1A-D) used to construct the
calibration library as well as for additional samples (2A-C
and 3A-D) derived from the same reference standard
but injected in different amounts. As expected, the
calibration samples themselves (1A-D), injected at
100 µg, show a very good score of 0.076 or less, with
an average value of 0.050, indicative of the extreme
similarity between all four replicates.

The increase in sample score for the 50 µg
injections (2A-C) to an average value of 0.798 is partly due
to a drift in chromatographic conditions resulting in
resolution changes for several peaks. The co-eluting
fragments T14a and T14c were separated into two
peaks, each with a spectrum different from the
composite spectrum contained in the calibration library.
The partially resolved peak pair T11 and T10c2
(Figure 2) was not separated at all and, consequently,
neither fragment was identified. Furthermore, the
fragment with the lowest concentration (T19) was not
detected at this smaller sample size.

The 200 µg injections (3A-D) show an average
score of 0.443, and thus fall between the 100 and the
50 µg samples. The increased sample score results
from the same problematic peaks encountered with
the 50 µg injection. In both the 50 and the 200 µg
injection, the additional standard peaks which were
missing were all small peaks of unknown identity.
This indicated that the significance of these
unidentified peaks with respect to sample identity needed to
be investigated in some more detail.

For the phosphate gradient system (gradient II),
similar data are shown in Figure 8b. Again, the four
calibration samples (1A-D) exhibit very low scores of
0.084 and less, with the average at 0.036. An
additional sample (2), which also contains reference
material but which was analyzed at a different time,
shows a higher score of 0.671. This score is in the
range of scores obtained for the 50 and 200 µg
injections of reference material with gradient I. Closer
inspection revealed that here, too, changes in peak
resolution had an adverse effect on the sample score.

In order to provide data on the kind of sample
score obtained with a sample known to differ from the
standard, samples of r-hGH that had been oxidized prior
to digestion with trypsin were analyzed to simulate
potential degradation pathways. As can be seen quite
clearly in Figure 8b, 3A-D, the average sample score
of 1.692 lies significantly above the scores obtained
for reference material and reflects the difference
between oxidized and native r-hGH. Furthermore,
reproducibility for the four samples is very good,
indicative of the similarity among replicate injections of
the oxidized samples.

To relate this abstract score to the more
traditional visual method of evaluation, Figure 9 shows a
chromatogram for the oxidized r-hGH digest. Peaks that
disappeared due to oxidation and those peaks that
appear as new fragments and are not encountered in
native r-hGH are clearly labeled.

Thus, although it is obvious that the
chromatogram in Figure 9 differs considerably from the
standard fragmentation pattern as indicated by the
arrows, the present invention provides some clear
advantages in reducing the potential for incorrect peak
matching: (1) the entire evaluation procedure can be
automated to obtain a final sample score without the
need for operator intervention; and (2) the scoring
procedure is completely digital and therefore not
subject to observer bias.

Turning to Figures 10-13, application of the
method of the present invention will be described. It should
be understood that where input is to be supplied to a
program or subprogram, said input can be provided in
interactive mode by an operator or can be taken
directly from a file containing the pertinent information.

The subprogram Make-Library (Figure 10)
implements the reduction of raw data to the two data sets
described in this invention. User input specific to this
subprogram, such as the names for input and output
files, wavelength selection, and integration
parameters, is supplied at step 101.

The file retrieved at step 102 is a raw data file
containing absorbance data as a function of both
wavelength and time, as would be appropriate for the
information generated by a diode-array detector. Any
such format could in principle be processed by the
subroutine, provided that low level routines for
interpretation of the file format are available. In the
preferred embodiment of the invention the format of raw
data is that produced by the Hewlett-Packard (HP)
diode-array detector.

After raw data have been retrieved from the
magnetic media, an appropriate signal characterizing the
chromatographic peak response is chosen for
analysis of peak data at step 103. A typical peak response
would be the absorbance as a function of time at a
specific wavelength or wavelength range, selected such
that all compounds of interest will exhibit absorbance
at said wavelength or wavelength range. However, it
is possible to use the average or maximum
absorbance over the wavelength range recorded (or a
subrange thereof) as the peak response at a given time
point.

Once a signal has been determined, the
subprogram finds all peaks for this signal in step 104 by
employing standard integration algorithms as
implemented on the HP ChemStation or any other such
algorithm similar in nature to those customarily
employed in chromatographic data handling. The result
of the peak finding step is the determination of peak
start, end, apex (retention time), area, and height, as
well as of the number of peaks encountered, which is
assigned to variable P in step 105.

At step 106, a library file is created which will
later receive relevant peak data as generated in
subsequent portions of this subprogram. This library file is
typically referred to as a sample library.

Next, a counter is initialized to a value of 1 at step
107 and the apex spectrum for the peak indexed by
the counter is found by the subprogram at step 108.
Appropriate reference spectra are then selected at
step 109, typically at the beginning and end of the
peak where normally only the solvent background is
present. Other criteria for the selection might be
employed, especially in cases where neighboring peaks
are not fully separated. The number of reference
spectra employed may also be varied depending on
the characteristics of the chromatographic system
employed.

In step 110, the reference spectra are then used 
to remove unwanted background absorbance from 
the apex spectrum in order to obtain a peak spectrum 
characteristic of the current peak. Although a number 
of different approaches can be used to construct this 
background correction, the preferred mode is to use 
linear interpolation of the reference spectra to the re- 
tention time of the apex spectrum and to subtract the 
interpolated spectrum from the apex spectrum. Another
approach would, for example, involve principal
component analysis of the solvent background
followed by linear least squares subtraction.
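
The preferred background correction of step 110 (linear interpolation of two reference spectra to the retention time of the apex, followed by subtraction) can be sketched as follows. Spectra are represented as plain lists of absorbance values on a common wavelength grid; this representation and the function names are assumptions made for illustration.

```python
def interpolate_reference(ref_start, ref_end, t_start, t_end, t_apex):
    """Linearly interpolate two reference spectra to the apex retention time."""
    if t_end == t_start:
        raise ValueError("reference spectra must be taken at different times")
    frac = (t_apex - t_start) / (t_end - t_start)
    return [a + frac * (b - a) for a, b in zip(ref_start, ref_end)]


def correct_apex_spectrum(apex, ref_start, ref_end, t_start, t_end, t_apex):
    """Subtract the interpolated solvent background from the apex spectrum."""
    background = interpolate_reference(ref_start, ref_end, t_start, t_end, t_apex)
    return [x - bg for x, bg in zip(apex, background)]
```

When the apex lies midway between the two reference points, the background subtracted at each wavelength is simply the mean of the two reference absorbances.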

At step 111, an optional wavelength calibration
can be applied to the peak spectrum by shifting the
wavelength axis left or right by a constant wavelength
amount as determined previously outside the scope
of the subprogram. This correction is important
primarily in cases where data for different samples
might be obtained from different instruments or
be derived over long periods of time on the same
instrument.

At step 112, any number of possible mathematical
treatments can be applied to the peak spectrum.
Examples of such treatments are smoothing, the
formation of higher order derivatives, splining of the
wavelength axis to obtain better resolution, or any
transformation of the spectrum.

The peak spectrum is transferred to the sample
library at step 113 and the other peak data for the
current peak as determined during the integration step
(104) are transferred to the sample library at step 114. 
Finally, at step 115, the counter is incremented and 
checked against the number of peaks P in step 116. 
If another peak needs to be processed the subpro- 
gram returns to step 108, otherwise the subprogram 
execution is complete. 

The Compare-Libs subprogram (Figure 11)
provides for most of the detailed matching between any
two samples presented to the subprogram in the form of
a sample library for each sample. In implementing
this subprogram, the first sample is considered to be
the reference or standard sample to be matched by
the second sample. It will, however, be understood
that the first sample can be of completely unknown
nature, as can the second sample. It should also be
understood that a 'sample library' can contain data
from either a single analysis of a sample processed
by the Make-Library subroutine or data derived from
multiple analyses of the same sample as they would
be correlated by the Make-Std-Library subprogram
from sample libraries generated with the
Make-Library subprogram.

In step 201, user parameters pertinent to this
subprogram are requested. User parameters include
the names of the sample libraries involved as well as
parameters describing the characteristics of the
matching process.

In step 202 the first (reference) sample library is
retrieved from magnetic media and is referred to as
L1. The number of peaks stored in this library is
determined and assigned to variable P1 in step 203.

Steps 204 and 205 repeat the previous two steps
for the second sample library, assigning the library
name to L2 and the number of peaks to P2, respectively.

Step 206 consists of a retention time correction,
whereby reference peaks defined in the reference
sample and expected to occur at the retention times
stored in the reference sample library are compared
against the retention times actually encountered in
the second sample. Appropriate corrections are
performed to the retention times of the second sample to
make them correspond to those of the first sample.
Any one of a variety of possible procedures can be
employed in this correction process, the simplest of
which is a piecewise linear fit between expected and
actual retention times. Those skilled in the art will
recognize that this correction may not be necessary.

In step 207, peak areas and peak heights of both the first
and second samples can be normalized in a number
of ways. Two possible methods are normalization to
the total area and height of all peaks in either sample,
such that all peaks are scaled to obtain an arbitrarily
selected constant value for these parameters, or to the
area and height of selected reference peaks, where
normalization implies that all peaks are scaled to
obtain arbitrarily selected constant values for these
reference peaks. Depending on the nature of the
chromatographic separation applied to the two samples,
this step may not prove necessary.
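
The two optional steps above can be illustrated with a short sketch: a piecewise linear retention time correction (step 206) and a normalization of areas or heights to a constant total (step 207). The function names, the list-of-pairs layout for reference peaks, the extrapolation from the end segments for times outside the reference peaks, and the default target of 100 are all assumptions made for the example.

```python
import bisect


def correct_retention_time(t, anchors):
    """Map a retention time from the second sample onto the first sample's scale.

    anchors: list of (actual_rt, expected_rt) pairs for the reference peaks,
    sorted by strictly increasing actual_rt (at least two pairs).
    """
    if len(anchors) < 2:
        raise ValueError("need at least two reference peaks")
    actual = [a for a, _ in anchors]
    # Locate the segment containing t; clamp so that times outside the
    # reference peaks are extrapolated from the first or last segment.
    i = bisect.bisect_right(actual, t) - 1
    i = max(0, min(i, len(anchors) - 2))
    (a0, e0), (a1, e1) = anchors[i], anchors[i + 1]
    return e0 + (t - a0) * (e1 - e0) / (a1 - a0)


def normalize_to_total(values, target=100.0):
    """Scale peak areas (or heights) so that they sum to an arbitrary constant."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize when the total is zero")
    return [v * target / total for v in values]
```

In this sketch a reference peak observed exactly at its anchor time is mapped exactly onto its expected time, and peaks between two reference peaks are shifted proportionally.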

Next, two counters are initialized in step 208. The first (i), for the peak currently to be matched, is set to 1. The other (k) will count the number of matches found for the current peak, up to a maximum of 10, which will be stored in a table of match values.

In step 209, relevant peak data for the peak currently indexed by i are retrieved from L1, and a retention time window centered upon the retention time of the current peak is constructed in step 210. This retention time window depends on knowledge of the chromatographic system employed in the separation of the first and second samples and can be extended to the total time spanned by the analysis of the first sample.

A second peak counter (j) for peaks in the second sample is initialized to 1 in step 211, and data for the peak indexed by j are retrieved from L2 at step 212. A branch point is provided at step 213 which tests whether peak j is inside the retention time window defined in step 210. If it is not, control passes to step 221. Otherwise, the subprogram continues on to step 214, where MTdis and CN are calculated for the data from peaks i in L1 and j in L2.

Those skilled in the art will recognize that the calculation of MTdis and CN can be done in a number of different ways, as described elsewhere in this invention, depending on the amount of information available for each peak, such as multiple, average, or individual spectra for the first or the second or both samples.

In step 215 any or all of the deviations defined in equations (4)-(8) are calculated from the relevant data for peak i in L1 and peak j in L2.

Next, in step 216, the number (k) of matches found so far is compared against the maximum number of matches allowed, which is arbitrarily set to a constant value of 10 but could be modified to any other meaningful value. If fewer than 10 matches have been found, the match counter is incremented in step 219. Otherwise, step 217 tests whether the match for the current peak is considered better than any of those currently stored. If it is, the match with the lowest score is deleted in step 218 and execution proceeds to step 220. Otherwise, control is transferred to step 221.

At step 220 the two branches of step 216 and the yes branch of step 217 converge again, and the match information for the current peak is inserted into the match table at the appropriate position.
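
The bookkeeping of steps 216 to 220 amounts to keeping a bounded, ordered table of the best matches. The sketch below is one possible reading, under two assumptions that the text does not state explicitly: a higher score means a better match, and "better than any of those currently stored" means better than at least the lowest-scoring stored match. The function name and the tuple layout are hypothetical.

```python
MAX_MATCHES = 10  # arbitrary limit from the text; any meaningful value works


def insert_match(table, candidate, max_matches=MAX_MATCHES):
    """Insert candidate = (score, peak_index) into a table kept sorted
    best-first. Returns True if the candidate was stored."""
    if len(table) >= max_matches:
        # Step 217: the table is full, so the candidate must beat the
        # lowest-scoring stored match (assumption: higher is better).
        if candidate[0] <= table[-1][0]:
            return False  # no: continue with the next peak (step 221)
        table.pop()  # step 218: delete the match with the lowest score
    # Step 220: insert at the appropriate position in the ordered table.
    pos = 0
    while pos < len(table) and table[pos][0] >= candidate[0]:
        pos += 1
    table.insert(pos, candidate)
    return True
```

With this layout the length of the table plays the role of the counter k, so no separate increment (step 219) is needed.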

At step 221 the counter j for the current peak in L2 is incremented and tested in step 222 against P2, the total number of peaks in L2. If j exceeds P2, the subprogram continues with step 223; otherwise, the next peak from L2 is processed by returning to step 212.

In step 223 the counter i for peaks in L1 is incremented and tested in step 224 against P1, the total number of peaks in L1. If i exceeds P1, the subprogram continues with step 225; otherwise, the next peak from L1 is processed by returning to step 209.

In step 225 peak assignment takes place between all peaks in L1 and all matches in the match table such that all conflicts are resolved by the hierarchical assignment procedure described in this invention. No more than one peak from L2 is assigned to each peak of L1 and no peak from L2 is assigned to more than one peak from L1.

Once peak assignment is complete, the peak 
score PS as defined in equation (10) is calculated in 
step 226 for each pair of matched peaks found in step 
225 and the subprogram is terminated. 
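
The loop over peak pairs (steps 209 to 224) and the one-to-one constraint of step 225 can be sketched as follows. The hierarchical assignment procedure itself is described elsewhere in the specification; the greedy best-score-first rule used here is only a stand-in that satisfies the same one-to-one constraint, and all names are hypothetical.

```python
def candidate_pairs(rt1, rt2, half_width):
    """Yield (i, j) for every peak j of L2 whose retention time lies inside
    the window centred on peak i of L1 (1-based indices, as in the text)."""
    for i, t1 in enumerate(rt1, start=1):
        for j, t2 in enumerate(rt2, start=1):
            if abs(t2 - t1) <= half_width:
                yield i, j


def assign_peaks(candidates):
    """candidates: iterable of (score, i, j). Returns {i: j} such that each
    L1 peak gets at most one L2 peak and each L2 peak is used at most once.
    Greedy stand-in, assuming a higher score means a better match."""
    assignment = {}
    used_j = set()
    for _score, i, j in sorted(candidates, reverse=True):
        if i in assignment or j in used_j:
            continue  # conflict: a better match already claimed i or j
        assignment[i] = j
        used_j.add(j)
    return assignment
```

Setting the half width large enough to cover the whole run corresponds to extending the window to the total time spanned by the analysis.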

The Make-Std-Library subprogram (Figure 12) is used to correlate data from one or several sample libraries to arrive at a standard library which contains statistical information derived from data sets 1 and 2 for all peaks, as well as from the original data from the individual libraries.

At step 301 user input is requested and assigned 
to variable L. User input may include information such 
as file names and the number of sample libraries to 
be processed. 

Next, in step 302, a temporary scratch library TEMP is created which will be used in the correlation. This library initially contains peak data on all peaks in the first sample library.

At step 303, a counter i is initialized to 2 and tested in step 304 against the total number L of sample libraries. If the counter exceeds L, the correlation process is complete and statistical processing commences at step 313. Otherwise, the subprogram proceeds to step 305.

At step 305 the current library indexed by i is compared to TEMP using the subroutine Compare-Libs described above. The invocation of Compare-Libs will result in an assignment between peaks in TEMP as reference library and peaks in the current sample library. Peak assignment between a given pair of peaks is considered positive if the peak score as returned by Compare-Libs is above a user-selected threshold. Any peaks in the current library not assigned to a peak from TEMP are then removed, together with all relevant peak data, in step 306.

Step 307 initializes a second counter j to a value one lower than the current value of i. Steps 308 to 311 will delete all peaks in TEMP that were not matched by any peak in the current sample library, as well as the corresponding peaks in all sample libraries already processed. Therefore, after step 311 all sample libraries up to the current one and library TEMP contain the same number of peaks, which are all correlated on a one-to-one basis.

If j tests larger than 0 in step 308, the subprogram proceeds to step 309, where all peaks corresponding to unmatched peaks in TEMP will be deleted in the sample library indexed by j. In step 310, j is then decremented and execution returns to step 308 until j tests equal to zero (0), in which case the subprogram continues with step 311. At that point the subprogram deletes the unmatched peaks from TEMP itself in step 311, increments counter i in step 312, and returns to step 304.
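
The correlation loop of steps 302 to 312 can be sketched as below. The `match` callable stands in for Compare-Libs combined with the user-selected threshold and is an assumption of this sketch, as are the list-based library layout and the function name.

```python
def correlate(first_library, other_libraries, match):
    """Keep only peaks that are matched in every sample library.

    first_library:   list of peaks (seeds the scratch library TEMP).
    other_libraries: list of peak lists, processed in order.
    match(temp, library) -> {index in temp: index in library}; a stand-in
        for Compare-Libs plus the peak score threshold.
    Returns [TEMP, lib_2', lib_3', ...], all row-aligned and equally long.
    """
    temp = list(first_library)              # step 302
    aligned = []                            # libraries already processed
    for library in other_libraries:         # steps 303, 304, 312
        pairs = match(temp, library)        # step 305
        keep = sorted(pairs)                # TEMP rows that found a partner
        current = [library[pairs[t]] for t in keep]            # step 306
        aligned = [[lib[t] for t in keep] for lib in aligned]  # steps 308-310
        temp = [temp[t] for t in keep]      # step 311
        aligned.append(current)
    return [temp] + aligned
```

After each pass every retained library has the same length as TEMP, which is the invariant the text states for the end of step 311.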

Beginning at step 313, statistical processing of all sample libraries correlated takes place. Program control is transferred to this step from step 304 if the test there indicates that all libraries have been processed (i.e., counter i exceeds the value of L).

In step 313 the number of peaks remaining in TEMP (and thus in all sample libraries) is determined and assigned to variable P. A new library file is created in step 314 to receive the data generated by the subsequent processing steps. This will be the standard library produced by the subprogram.

Counter i is again initialized to 1 in step 315, and the peak spectrum for the peak indexed by i is transferred from each sample library to the standard library in step 316. An average spectrum is calculated from the individual peak spectra and also stored in the standard library in step 317.

Individual peak data for the current peak from each of the sample libraries are transferred to the standard library in step 318. This is followed by peak data averaging in each category, which data are stored in step 319.

In step 320 all appropriate spectral matches Ma are calculated from the individual and average spectra and transferred to the standard library in step 321.

The counter i is then incremented in step 322 and tested against the total number of peaks P. If i exceeds P, the program is terminated. Otherwise, the next peak is processed by returning to step 316.
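
The averaging performed in steps 317 and 319 can be sketched as follows, assuming that all spectra of one peak are sampled at the same wavelengths and that the peak data of each library are held in a dictionary with identical keys. The key names used in the example are illustrative only.

```python
from statistics import mean


def average_spectrum(spectra):
    """Element-wise mean of equally long absorbance lists (step 317)."""
    return [mean(values) for values in zip(*spectra)]


def average_peak_data(records):
    """Mean of each peak data category over all sample libraries (step 319).
    records: list of dicts sharing the same keys, one dict per library."""
    keys = records[0].keys()
    return {key: mean(r[key] for r in records) for key in keys}
```

For instance, `average_peak_data([{"area": 10.0, "height": 2.0}, {"area": 14.0, "height": 4.0}])` yields an area of 12.0 and a height of 3.0.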

The Get-Sample-Score program (Figure 13) incorporates the previously described subprograms to arrive at an overall sample score indicative of the similarity between any two samples analyzed by the same or different chromatographic conditions on the same or different instruments. The overall procedure that results in the sample score will also identify those peaks in the two samples that can be considered to be derived from the same chemical compound present in the two samples.

The overall procedure assumes that raw data for the number of replicates R1 and R2 defined for the first and second sample, respectively, are available. This does not preclude the possibility that these data are generated concurrently with execution of Get-Sample-Score. Such concurrent generation would enable completely unattended operation of the overall sample scoring procedure.

In step 401 user input specific to the overall matching procedure is requested. Such input includes file names, match criteria for Compare-Libs, criteria for correlation of sample libraries by Make-Std-Library, and the weighting factors used for the calculation of the sample score.

In step 402 a standard library (S1) characteristic of the first sample and containing data for R1 replicates can be provided. If one is available, program execution is transferred to step 410. Otherwise, a standard library is generated in steps 403 through 409.

In step 403 input is requested concerning the number of replicates for the first sample and assigned to the variable R1. Next, a counter is initialized to 1 in step 404 and the raw data for the replicate analysis of the first sample as indexed by the counter is retrieved in step 405. Subroutine Make-Library is invoked in step 406 to produce a sample library for the current replicate. The counter is incremented in step 407 and, if more replicates are to be processed as tested in step 408, the program returns to step 405. Otherwise, subprogram Make-Std-Library is called next in step 409 to generate a standard library S1 from the individual sample libraries.

In step 410 a standard library (S2) characteristic of the second sample and containing data for R2 replicates can be provided. If such a standard is available, program execution is transferred to step 418. Otherwise, a standard library is generated in steps 411 through 417.

In step 411 input is requested concerning the number of replicates for the second sample and assigned to variable R2. Next, a counter is initialized to 1 in step 412 and the raw data for the replicate analysis of the second sample as indexed by the counter is retrieved in step 413. The subroutine Make-Library is invoked in step 414 to produce a sample library for the current replicate. The counter is incremented in step 415 and, if more replicates are to be processed as tested in step 416, the program returns to step 413. Otherwise, subprogram Make-Std-Library is called next in step 417 to generate a standard library S2 from the individual sample libraries.

In step 418 subprogram Compare-Libs is used to match standard libraries S1 and S2, resulting in output in step 419 of peak assignment and peak scores for each peak in the first sample. From the individual peak scores the overall sample score can be calculated based on equation (11) in step 420. Step 421 provides for output of the sample score to an appropriate device, and in step 422 a final report is generated which could incorporate information on reproducibility and confidence intervals previously obtained for sample scores from the two samples in question to make a decision as to whether or not the two samples are identical. At this point program execution is complete.


Claims 

1. A method for distinguishing a first chemical compound from a second chemical compound on the basis of chromatographic data wherein said chemical compounds absorb ultraviolet radiation, comprising the steps of:

exposing at least one of the chemical compounds one or more times to one or more selected wavelengths of ultraviolet radiation;

recording the respective absorbances of at least one of the chemical compounds upon each exposure to said ultraviolet radiation;

providing a first data set to processing means, said first data set comprising the respective absorbances for the first and second chemical compounds upon one or more exposures to one or more selected wavelengths of ultraviolet radiation; and

providing at least one spectral match factor by applying, via the processing means, a spectral matching function to the first data set;

characterized in that the method further comprises the steps of:

deriving at least one match discriminator ($MT_{dis}$) from at least one of the above-mentioned spectral match factors;

providing at least one peak score by applying, via the processing means, a peak scoring function to the first data set, wherein providing at least one peak score comprises the steps of:

providing to the processing means weighting factors ($f_m$, $f_r$, $f_a$) for the match discriminators, retention time deviations ($RT_{dev}$), and area and height deviations ($AH_{dev}$), respectively, wherein the weighting factor ($f_m$) for the match discriminators is greater than the weighting factor ($f_r$) for the retention time deviations and the weighting factor ($f_a$) for the area and height deviations; and

applying the peak scoring function to the match discriminators, the retention time deviations, and the area and height deviations according to:

$$PS = \frac{(f_m\,MT_{dis}) + (f_r\,RT_{dev}) + (f_a\,AH_{dev})}{NF}$$

where $PS$ is peak score, $f_m$ is a weighting factor for the match discriminator, $f_r$ is a weighting factor for the retention time deviation, $f_a$ is a weighting factor for the area and height deviation, and $NF$ is an empirically derived normalization factor;

distinguishing the first chemical compound from the second chemical compound on the basis of the peak score ($PS$).

2. The method of claim 1 characterized in that the match discriminators ($MT_{dis}$) are derived according to:

$$MT_{dis} = \frac{D}{T(DF, prob)}$$

where $MT_{dis}$ is the match discriminator, $D$ is the difference for the mean match factor derived from automatching and crossmatching functions, $DF$ is the degrees of freedom which are calculated from the number of individual spectra for the first and second chemical compounds, and $T(DF, prob)$ is the t-value required for a desired degree of probability ($prob$, in %) that two means differing by that t-value are different given the degrees of freedom applicable.



3. The method of claim 1 or 2 characterized in that the spectral matching function is applied to the first data set according to:

$$MF_g = 1000\,(1 - r^2)$$

where $MF_g$ is a general match factor and $r$ is a correlation coefficient which relates the absorbances for the first chemical compound at selected wavelengths to the absorbances for the second chemical compound at the same wavelengths.

4. The method of claim 1 or 2 characterized in that the spectral matching function is applied to the first data set and to the average absorbances according to:

$$MF_a = 1000\,(1 - r^2)$$

where $MF_a$ is an automatch factor and $r$ is a correlation coefficient which relates the individual absorbances of a chemical compound at selected wavelengths to the average absorbances for the same chemical compound at the same wavelengths.

5. The method of claim 1 or 2 characterized in that the spectral matching function is applied to the first data set and to the average absorbances according to:

$$MF_x = 1000\,(1 - r^2)$$

wherein $MF_x$ is a crossmatch factor and $r$ is a correlation coefficient which relates the individual absorbances for one of the chemical compounds at selected wavelengths to the average absorbances for the other chemical compound at the same wavelengths.

6. The method of one of the claims 3 to 5 characterized in that $r$ is applied according to:

$$r = \frac{\sum xy - (\sum x)(\sum y)/n_f}{\left[\left(\sum x^2 - (\sum x)^2/n_f\right)\left(\sum y^2 - (\sum y)^2/n_f\right)\right]^{1/2}}$$

where $x$ and $y$, respectively, are the absorbances of the first and second chemical compounds at the same wavelength, or where $x$ and $y$, respectively, are individual and averaged absorbances for the same chemical compound at the same wavelength, or where $x$ and $y$, respectively, are the individual absorbances for one chemical compound and averaged absorbances for the other chemical compound at the same wavelength, and where $\sum$ is the summation function, and $n_f$ is the number of selected wavelengths.

7. The method as claimed in one of the claims 1 to 6 characterized by the step of preparing the first data set after providing said data set to the processing means, wherein the step of preparing the first data set comprises the steps of:

selecting a portion of the data set; and

calibrating the selected portion.

8. The method as claimed in one of the claims 1 to 7 characterized in that the step of providing said at least one match discriminator ($MT_{dis}$) comprises the step of applying, via the processing means, a match discrimination function to general match factors.

9. The method as claimed in one of the claims 1 to 8, characterized in that the step of providing said at least one retention time deviation ($RT_{dev}$) comprises the step of applying, via the processing means, a retention time deviation function to the retention times according to:

$$RT_{dev} = \frac{|RT_1 - RT_2|}{RT_{lim}}$$

wherein $RT_{dev}$ is retention time deviation, $RT_1$ is the average of retention times for the first chemical compound, $RT_2$ is the average of retention times for the second chemical compound, and $RT_{lim}$ is a limit of variability for retention times.

10. The method as claimed in one of the claims 1 to 9 characterized by the steps of:

providing at least one peak area deviation ($AR_{dev}$) by applying, via the processing means, a peak area deviation function to the peak areas; and

further distinguishing the first chemical compound from the second chemical compound on the basis of at least one peak area deviation;

wherein the peak area deviation function is applied to the peak areas according to:

$$AR_{dev} = \frac{|AR_1 - AR_2|}{AR_{lim}}$$

wherein $AR_{dev}$ is peak area deviation, $AR_1$ is the average of peak areas for the first chemical compound, $AR_2$ is the average of peak areas for the second chemical compound, and $AR_{lim}$ is a limit of variability for peak area.

11. The method as claimed in one of the claims 1 to 10 characterized by the steps of:

providing at least one peak height deviation ($HT_{dev}$) by applying, via the processing means, a peak height deviation function to the peak heights; and

further distinguishing the first chemical compound from the second chemical compound on the basis of at least one peak height deviation;

wherein the peak height deviation function is applied to the peak heights according to:

$$HT_{dev} = \frac{|HT_1 - HT_2|}{HT_{lim}}$$

where $HT_{dev}$ is peak height deviation, $HT_1$ is the average of peak heights for the first chemical compound, $HT_2$ is the average of peak heights for the second chemical compound, and $HT_{lim}$ is a limit of variability for peak height.

12. The method as claimed in one of the claims 1 to 11 characterized in that the step of providing at least one area and height deviation ($AH_{dev}$) comprises the step of applying, via the processing means, an area and height deviation function to the peak area deviations and the peak height deviations according to:

$$AH_{dev} = \frac{AR_{dev} + HT_{dev}}{2}$$

wherein $AH_{dev}$ is area and height deviation, $AR_{dev}$ is peak area deviation, and $HT_{dev}$ is peak height deviation.
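
The formulas recited in claims 1, 3 to 6 and 9 to 12 can be collected into a short numerical sketch. It follows the equations as claimed; the function and parameter names are illustrative, and the check on the weighting factors simply mirrors the condition in claim 1 that the match discriminator weight exceeds the other two.

```python
from math import sqrt


def correlation(x, y):
    """Correlation coefficient r between two absorbance lists (claim 6)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (sxy - sx * sy / n) / sqrt((sxx - sx * sx / n) * (syy - sy * sy / n))


def match_factor(x, y):
    """MF = 1000 (1 - r^2), as written in claims 3 to 5."""
    r = correlation(x, y)
    return 1000.0 * (1.0 - r * r)


def deviation(mean_1, mean_2, limit):
    """|v1 - v2| / limit; the common form of RTdev, ARdev and HTdev."""
    return abs(mean_1 - mean_2) / limit


def area_height_deviation(ar_dev, ht_dev):
    """AHdev = (ARdev + HTdev) / 2 (claim 12)."""
    return (ar_dev + ht_dev) / 2.0


def peak_score(mt_dis, rt_dev, ah_dev, f_m, f_r, f_a, nf):
    """PS = (f_m MTdis + f_r RTdev + f_a AHdev) / NF (claim 1)."""
    if not (f_m > f_r and f_m > f_a):
        raise ValueError("f_m must exceed f_r and f_a")
    return (f_m * mt_dis + f_r * rt_dev + f_a * ah_dev) / nf
```

Which pair of lists is passed to `match_factor` decides whether the result is a general, automatch or crossmatch factor: two individual spectra, an individual spectrum with the average of the same compound, or an individual spectrum with the average of the other compound.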


Patent claims (Patentansprüche)

1. A method for distinguishing a first chemical compound from a second chemical compound on the basis of chromatographic data, in which the chemical components absorb ultraviolet radiation, comprising the following steps:

exposing at least one of the chemical compounds one or more times to one or more selected wavelengths of ultraviolet radiation;

recording the respective absorbances of at least one of the chemical components each time it is exposed to the ultraviolet radiation;

supplying a first data set to a processing device, the first data set comprising the respective absorbances for the first and the second chemical component that were exposed one or more times to one or more selected wavelengths of ultraviolet radiation; and

providing at least one spectral match factor by applying, via the processing device, a spectral matching function to the first data set;

characterized in that the method further comprises the following steps:

deriving at least one discriminator value ($MT_{dis}$) from at least one of the above-mentioned spectral match factors;

providing at least one peak score by applying, via the processing device, a peak scoring function to a first data set, wherein providing at least one peak score includes the following steps:

supplying weighting factors ($f_m$, $f_r$, $f_a$) for the match discriminators, retention time deviations ($RT_{dev}$), and area and height deviations ($AH_{dev}$), respectively, to the processing device, wherein the weighting factor ($f_m$) for the match discriminators is greater than the weighting factor ($f_r$) for the retention time deviations and the weighting factor ($f_a$) for the area and height deviations; and

applying the peak scoring function to the match discriminators, the retention time deviations, and the area and height deviations according to the following equation:

$$PS = \frac{(f_m\,MT_{dis}) + (f_r\,RT_{dev}) + (f_a\,AH_{dev})}{NF}$$

in which $PS$ is the peak score, $f_m$ is a weighting factor for the match discriminator, $f_r$ is a weighting factor for the retention time deviation, $f_a$ is a weighting factor for the area and height deviation, and $NF$ is an empirically derived normalization factor;

wherein the first chemical compound is distinguished from the second chemical compound on the basis of the peak score ($PS$).

2. The method according to claim 1, characterized in that the match discriminators ($MT_{dis}$) are derived according to the following equation:

$$MT_{dis} = \frac{D}{T(DF, prob)}$$

in which $MT_{dis}$ is the match discriminator, $D$ is the difference for the mean match factor derived from the automatching and crossmatching functions, $DF$ is the degrees of freedom calculated from the number of individual spectra for the first and the second chemical compound, and $T(DF, prob)$ is the t-value required for a desired degree of probability ($prob$, in percent) that two means differing by this t-value are different, given the applicable degrees of freedom.

3. The method according to claim 1 or 2, characterized in that the spectral matching function is applied to the first data set according to the following equation:

$$MF_g = 1000\,(1 - r^2)$$

in which $MF_g$ is a general match factor and $r$ is a correlation coefficient which relates the absorbances for the first chemical compound at selected wavelengths to the absorbances of the second chemical compound at the same wavelengths.

4. The method according to claim 1 or 2, characterized in that the spectral matching function is applied to the first data set and to the averaged absorbances according to the following equation:

$$MF_a = 1000\,(1 - r^2)$$

in which $MF_a$ is an automatch factor and $r$ is a correlation coefficient which relates the individual absorbances of a chemical compound at selected wavelengths to the averaged absorbances for the same chemical compound at the same wavelengths.

5. The method according to claim 1 or 2, characterized in that the spectral matching function is applied to the first data set and to the averaged absorbances according to the following equation:

$$MF_x = 1000\,(1 - r^2)$$

in which $MF_x$ is a crossmatch factor and $r$ is a correlation coefficient which relates the individual absorbances of one of the chemical compounds at selected wavelengths to the averaged absorbances of the other chemical compound at the same wavelengths.

6. The method according to one of claims 3 to 5, characterized in that $r$ is used according to the following equation:

$$r = \frac{\sum xy - (\sum x)(\sum y)/n_f}{\left[\left(\sum x^2 - (\sum x)^2/n_f\right)\left(\sum y^2 - (\sum y)^2/n_f\right)\right]^{1/2}}$$

in which $x$ and $y$, respectively, are the absorbances of the first and the second chemical components at the same wavelength, or in which $x$ and $y$, respectively, are individual and averaged absorbances for the same chemical component at the same wavelength, or in which $x$ and $y$, respectively, are the individual absorbances for one chemical component and the averaged absorbances for the other chemical component at the same wavelength, in which $\sum$ is the summation function and in which $n_f$ is the number of selected wavelengths.

7. The method according to one of claims 1 to 6, characterized by the step of preparing the first data set after supplying the data set to the processing device, in which the step of preparing the first data set includes the following steps:

selecting a portion of the data set; and

calibrating the selected portion.

8. The method according to one of claims 1 to 7, characterized in that the step of providing at least one match discriminator ($MT_{dis}$) includes the step of applying, via the processing device, a match discrimination function to the general match factors.

9. The method according to one of claims 1 to 8, characterized in that the step of providing the at least one retention time deviation ($RT_{dev}$) includes the step of applying, via the processing device, a retention time deviation function to the retention times according to the following equation:

$$RT_{dev} = \frac{|RT_1 - RT_2|}{RT_{lim}}$$

in which $RT_{dev}$ is the retention time deviation, $RT_1$ is the average of the retention times for the first chemical compound, $RT_2$ is the average of the retention times for the second chemical compound, and $RT_{lim}$ is a limit for the variability of the retention times.

10. The method according to one of claims 1 to 9, characterized by the following steps:

providing at least one peak area deviation ($AR_{dev}$) by applying, via the processing device, a peak area deviation function to the peak areas; and

further distinguishing the first chemical component from the second chemical component on the basis of at least one peak area deviation;

in which the peak area deviation function is applied to the peak areas according to the following equation:

$$AR_{dev} = \frac{|AR_1 - AR_2|}{AR_{lim}}$$

in which $AR_{dev}$ is the peak area deviation, $AR_1$ is the average of the peak areas for the first chemical compound, $AR_2$ is the average of the peak areas for the second chemical compound, and $AR_{lim}$ is a limit for the variability of the peak area.


11. The method according to one of claims 1 to 10, characterized by the following steps:

providing at least one peak height deviation ($HT_{dev}$) by applying, via the processing device, a peak height deviation function to the peak heights; and

further distinguishing the first chemical component from the second chemical component on the basis of at least one peak height deviation;

in which the peak height deviation function is applied to the peak heights according to the following equation:

$$HT_{dev} = \frac{|HT_1 - HT_2|}{HT_{lim}}$$

in which $HT_{dev}$ is the peak height deviation, $HT_1$ is the mean of the peak heights for the first chemical component, $HT_2$ is the mean of the peak heights for the second chemical compound, and $HT_{lim}$ is a limit for the variability of the peak height.

12. The method according to one of claims 1 to 11, characterized in that the step of providing at least one area and height deviation ($AH_{dev}$) includes the step of applying, via the processing device, an area and height deviation function to the peak area deviations and the peak height deviations according to the following equation:

$$AH_{dev} = \frac{AR_{dev} + HT_{dev}}{2}$$

in which $AH_{dev}$ is the area and height deviation, $AR_{dev}$ is the peak area deviation, and $HT_{dev}$ is the peak height deviation.



Claims (Revendications)

1. A method for distinguishing a first chemical compound from a second chemical compound on the basis of chromatographic data, said chemical compounds absorbing ultraviolet radiation, comprising the steps of:

exposing at least one of the chemical compounds one or more times to one or more selected wavelengths of ultraviolet radiation;

recording the respective absorbances of at least one of the chemical compounds upon each exposure to said ultraviolet radiation;

providing a first data set to processing means, said first data set comprising the respective absorbances for the first and second chemical compounds upon one or more exposures to one or more selected wavelengths of ultraviolet radiation; and

providing at least one spectral match factor by applying, via the processing means, a spectral matching function to the first data set;

characterized in that it further comprises the steps of:

deriving at least one match discriminator ($MT_{dis}$) from at least one of the above-mentioned spectral match factors;

providing at least one peak score by applying, via the processing means, a peak scoring function to the first data set, the provision of at least one peak score comprising the steps of:

providing to the processing means weighting factors ($f_m$, $f_r$, $f_a$) for the match discriminators, retention time deviations ($RT_{dev}$), and area and height deviations ($AH_{dev}$), respectively, the weighting factor ($f_m$) for the match discriminators being greater than the weighting factor ($f_r$) for the retention time deviations and the weighting factor ($f_a$) for the area and height deviations; and

applying the peak scoring function to the match discriminators, the retention time deviations, and the area and height deviations according to the equation:

$$PS = \frac{(f_m\,MT_{dis}) + (f_r\,RT_{dev}) + (f_a\,AH_{dev})}{NF}$$

where $PS$ is the peak score, $f_m$ is a weighting factor for the match discriminator, $f_r$ is a weighting factor for the retention time deviation, $f_a$ is a weighting factor for the area and height deviation, and $NF$ is an empirically derived normalization factor;

distinguishing the first chemical compound from the second chemical compound on the basis of the peak score ($PS$).

2. Method according to claim 1, characterized in that the match
discriminators (MT_dis) are derived according to the equation:

$$MT_{dis} = \frac{D}{T(DF, prob)}$$

where MT_dis is the match discriminator, D is the difference for the
mean match factor which is derived from automatch and crossmatch
functions, DF represents the degrees of freedom, which are calculated
from the number of individual spectra for the first and second
chemical compounds, and T(DF, prob) is the t-value required for a
desired probability level (prob, in %) that two means which differ by
the t-value are different, given the applicable degrees of freedom.
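
As a rough illustration of claim 2, the sketch below assumes the equality reads MT_dis = D / T(DF, prob). It further assumes that D is the mean crossmatch factor minus the mean automatch factor (the claim does not spell out the order of subtraction), and it takes the t-value as an argument supplied by the caller, for instance from a Student's t table. The function name is hypothetical.

```python
from statistics import mean


def match_discriminator(auto_factors, cross_factors, t_value):
    """Return MT_dis = D / T(DF, prob).

    Assumption: D = mean(crossmatch factors) - mean(automatch factors).
    t_value is T(DF, prob), looked up by the caller.
    """
    if t_value <= 0:
        raise ValueError("t_value must be positive")
    d = mean(cross_factors) - mean(auto_factors)
    return d / t_value


# Made-up match factors; D = 12.0 - 2.0 = 10.0, so MT_dis = 10.0 / 2.5
mt_dis = match_discriminator([1.0, 2.0, 3.0], [10.0, 12.0, 14.0], 2.5)
```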



3. Method according to claim 1 or 2, characterized in that the
spectral match function is applied to the first data set according to
the equation:

$$MF_g = 1000\,(1 - r^2)$$

where MF_g is a global match factor and r is a correlation
coefficient which relates the absorbances for the first chemical
compound at selected wavelengths to the absorbances for the second
chemical compound at the same wavelengths.

4. Method according to claim 1 or 2, characterized in that the
spectral match function is applied to the first data set and to
averaged absorbances according to the equation:

$$MF_a = 1000\,(1 - r^2)$$

where MF_a is an automatch factor and r is a correlation coefficient
which relates individual absorbances for a chemical compound at
selected wavelengths to averaged absorbances for the same chemical
compound at the same wavelengths.

5. Method according to claim 1 or 2, characterized in that the
spectral match function is applied to the first data set and to
averaged absorbances according to the equation:

$$MF_x = 1000\,(1 - r^2)$$

where MF_x is a crossmatch factor and r is a correlation coefficient
which relates individual absorbances for one of the chemical
compounds at selected wavelengths to averaged absorbances for the
other chemical compound at the same wavelengths.

6. Method according to any one of claims 3 to 5, characterized in
that r is applied according to the equation:

$$r = \frac{\sum xy - (\sum x)(\sum y)/n_f}{\left[\left(\sum x^2 - (\sum x)^2/n_f\right)\left(\sum y^2 - (\sum y)^2/n_f\right)\right]^{1/2}}$$

where x and y, respectively, are the absorbances of the first and
second chemical compounds at the same wavelength, or where x and y,
respectively, are individual and averaged absorbances for the same
chemical compound at the same wavelength, or where x and y,
respectively, are individual absorbances for one chemical compound and
averaged absorbances for the other chemical compound at the same
wavelength, and where Σ is the summation function and n_f is the
number of selected wavelengths.
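
The match factor of claims 3 to 5 and the correlation coefficient of claim 6 can be sketched as follows. The function names are hypothetical; `x` and `y` stand for two absorbance lists sampled at the same n_f selected wavelengths, and whether the result is a global, automatch, or crossmatch factor depends only on which spectra are passed in.

```python
from math import sqrt


def correlation(x, y):
    """Correlation coefficient r from the raw sums given in claim 6."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length, at least 2")
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    denom = sqrt((sxx - sx * sx / n) * (syy - sy * sy / n))
    if denom == 0:
        raise ValueError("a constant spectrum has no defined correlation")
    return (sxy - sx * sy / n) / denom


def match_factor(x, y):
    """Match factor MF = 1000 * (1 - r^2); 0 means identical shape."""
    r = correlation(x, y)
    return 1000.0 * (1.0 - r * r)


# Two made-up spectra that differ only by a scale factor give MF = 0.
mf_same_shape = match_factor([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

Because the factor depends on r squared, it is insensitive to overall scaling of either spectrum, so it compares spectral shape rather than concentration.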



7. Method according to any one of claims 1 to 6, characterized by the
step of preparing the first data set after having provided said data
set to the processing means, wherein the step of preparing the first
data set comprises the steps of:

selecting a portion of the data set; and

calibrating the selected portion.

8. Method according to any one of claims 1 to 7, characterized in
that the step of providing at least one of said match discriminators
(MT_dis) comprises the step of applying, via the processing means, a
match discrimination function to the global match factors.

9. Method according to any one of claims 1 to 8, characterized in
that the step of providing at least one of said retention time
deviations (RT_dev) comprises the step of applying to the retention
times, via the processing means, a retention time deviation function
according to the equation:

$$RT_{dev} = \frac{|RT_1 - RT_2|}{RT_{lim}}$$

where RT_dev is a retention time deviation, RT_1 is the average of
the retention times for the first chemical compound, RT_2 is the
average of the retention times for the second chemical compound, and
RT_lim is a variability limit for the retention times.
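
A minimal sketch of the retention time deviation of claim 9 follows. The function name and the sample values are hypothetical; RT_lim is treated as a positive number supplied by the user, since the claim does not say how the variability limit is obtained.

```python
from statistics import mean


def retention_time_deviation(rt_first, rt_second, rt_lim):
    """Return RT_dev = |RT_1 - RT_2| / RT_lim.

    rt_first and rt_second are the replicate retention times for each
    compound; RT_1 and RT_2 are their averages.
    """
    if rt_lim <= 0:
        raise ValueError("rt_lim must be positive")
    return abs(mean(rt_first) - mean(rt_second)) / rt_lim


# Made-up replicate retention times in minutes: |5.1 - 5.6| / 0.25
rt_dev = retention_time_deviation([5.0, 5.2], [5.5, 5.7], 0.25)
```

A value above 1 means the averages differ by more than the stated variability limit.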

10. Method according to any one of claims 1 to 9, characterized by
the steps of:

providing at least one peak area deviation (AR_dev) by applying, via
the processing means, a peak area deviation function to the peak
areas; and

also distinguishing the first chemical compound from the second
chemical compound on the basis of at least one peak area deviation;

wherein the peak area deviation function is applied to the peak areas
according to the equation:

$$AR_{dev} = \frac{|AR_1 - AR_2|}{AR_{lim}}$$

where AR_dev is the peak area deviation, AR_1 is the average of the
peak areas for the first chemical compound, AR_2 is the average of
the peak areas for the second chemical compound, and AR_lim is a
variability limit for the peak area.



11. Method according to any one of claims 1 to 10, characterized by
the steps of:

providing at least one peak height deviation (HT_dev) by applying,
via the processing means, a peak height deviation function to the
peak heights; and

also distinguishing the first chemical compound from the second
chemical compound on the basis of at least one peak height deviation;

wherein the peak height deviation function is applied to the peak
heights according to the equation:

$$HT_{dev} = \frac{|HT_1 - HT_2|}{HT_{lim}}$$

where HT_dev is the peak height deviation, HT_1 is the average of the
peak heights for the first chemical compound, HT_2 is the average of
the peak heights for the second chemical compound, and HT_lim is a
variability limit for the peak height.

12. Method according to any one of claims 1 to 11, characterized in
that the step of providing at least one area and height deviation
(AH_dev) comprises the step of applying, via the processing means, an
area and height deviation function to the peak area deviations and
the peak height deviations, according to the equation:

$$AH_{dev} = \frac{AR_{dev} + HT_{dev}}{2}$$

where AH_dev is an area and height deviation, AR_dev is a peak area
deviation, and HT_dev is a peak height deviation.
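
The area and height deviations of claims 10 to 12 share one form, so they can be sketched with a single helper. All names and numbers below are hypothetical; the inputs are taken to be already-averaged peak areas and heights, and the variability limits are assumed to be positive values chosen by the user.

```python
def relative_deviation(mean_first, mean_second, limit):
    """Return |mean_first - mean_second| / limit (the AR_dev, HT_dev form)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return abs(mean_first - mean_second) / limit


def area_height_deviation(ar_1, ar_2, ar_lim, ht_1, ht_2, ht_lim):
    """Return AH_dev = (AR_dev + HT_dev) / 2."""
    ar_dev = relative_deviation(ar_1, ar_2, ar_lim)
    ht_dev = relative_deviation(ht_1, ht_2, ht_lim)
    return (ar_dev + ht_dev) / 2.0


# Made-up averages: AR_dev = 10/5 = 2.0, HT_dev = 3/2 = 1.5
ah_dev = area_height_deviation(100.0, 110.0, 5.0, 20.0, 23.0, 2.0)
```

Dividing each difference by its own variability limit puts area and height on a common, unitless scale before they are averaged.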
[Fig. 10. Flowchart, PROGRAM MAKE LIBRARY (reference numerals up to 116): user input; retrieve raw data file; select signal; find all peaks; number of peaks => P; create file for sample library; initialize counter i => 1; find apex spectrum for peak i; find appropriate reference spectra; do background correction => peak spectrum; calibrate wavelength axis; transfer spectrum to library file; transfer peak data to library file; decision with NO branch.]
[Flowchart, SUBPROGRAM COMPARE LIBS (reference numerals 200 to 226): get user input; retrieve library L1; determine number of peaks => P1; retrieve library L2; determine number of peaks => P2; correct retention times; normalize area and height; initialize counters i => 1, k => 0; retrieve data for peak i in L1; calculate retention time window; initialize counter j => 1; retrieve data for peak j; decision: peak j inside retention window?; calculate MTdis, CN; calculate deviations; decision: match better than table?; remove entry from table; insert match into table; j = j + 1; do peak assignment; calculate peak scores; end.]
[Fig. 12. Flowchart, SUBPROGRAM MAKE STD LIBRARY (reference numerals 300 to 323): user input, number of libraries => L; copy library to TEMP; initialize counter i; COMPARE LIBS, LIB i <=> TEMP; delete extra peaks in LIB i; initialize counter j => i-1; delete peaks from LIB j; delete peaks from TEMP; increment i; count peaks in TEMP => P; create file for standard library; initialize counter i => 1; store spectrum i for all libraries to STDLIB; store average spectrum to STDLIB; store info for peak i for all libraries to STDLIB; store average peak info to STDLIB; calculate Ma for all spectra; store Ma to STDLIB; decision: i > P?; end.]
[Flowchart, MAIN PROGRAM GET SAMPLE SCORE (reference numerals 401 to 422): get user input; decision: standard library S1 for R1 replicates available?; input number of replicates => R1; initialize counter i => 1; retrieve raw data for sample i; MAKE LIBRARY i; MAKE STD LIBRARY S1; decision: standard library S2 for R2 replicates available?; input number of replicates => R2; initialize counter j; retrieve raw data for sample j; MAKE LIBRARY j; MAKE STD LIBRARY S2; COMPARE LIBS S1 <> S2; output peak scores; calculate sample score; output sample score; output final report; end.]